Arabic Natural Language Processing


ArabDiac©, ArabMorpho©, ArabTagger©,

Arabic Lexical Semantic Analysis, Swift©, ArabDictions©

 

ArabDiac©

RDI's Automatic Arabic Phonetic Transcriptor (Diacritizer/Vowelizer)

 

This large-scale technology takes raw (undiacritized) Arabic text as input and produces the corresponding fully diacritized text with a word accuracy rate exceeding 96%, which is vital for Arabic speech technologies such as Arabic TTS.
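As a rough illustration of what a word accuracy rate measures, here is a minimal sketch. It assumes that a word counts as correct only when its fully diacritized form matches the reference exactly; the function name and the transliterated sample words are hypothetical, and RDI's actual evaluation procedure may differ.

```python
def word_accuracy(reference_words, system_words):
    """Fraction of words whose diacritized form matches the reference exactly.

    Assumption: both lists are aligned word by word (same length, same order).
    """
    if len(reference_words) != len(system_words):
        raise ValueError("reference and system output must have the same length")
    if not reference_words:
        raise ValueError("cannot compute accuracy on an empty text")
    correct = sum(ref == out for ref, out in zip(reference_words, system_words))
    return correct / len(reference_words)


# Hypothetical sample in Latin transliteration: the second word differs
# only in its final (case-ending) vowel, so it counts as an error.
reference = ["kataba", "Alwaladu", "Aldarsa"]
system = ["kataba", "Alwalada", "Aldarsa"]
accuracy = word_accuracy(reference, system)  # 2 correct words out of 3
```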

 

To achieve this, ArabDiac© is built on RDI's Arabic NLP infrastructure (see below), especially RDI's Arabic Morphological Analyzer, Arabic PoS Tagger, Arabic Phonetic Grammar, and Arabic Text Normalizer, which serve as rule-based language factorizers, along with extensive statistical processing.


To contextually disambiguate the multiple possible factorizations (analyses) that result for each word in the input text, ArabDiac© applies extensive dual-mode statistical techniques that work back and forth between linguistically factorized and un-factorized entity sequences.
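The general idea of picking one analysis per word from several candidates can be sketched as a best-path search. The example below is only an illustration, assuming a plain bigram model with Viterbi search over the candidate lattice; it is not RDI's actual dual-mode method, and the function name, probabilities, and transliterated candidates are all hypothetical.

```python
import math


def viterbi_disambiguate(candidates, unigram, bigram, floor=1e-6):
    """Choose one analysis per word so that the whole sequence scores highest.

    candidates: list with one entry per word, each a list of candidate analyses.
    unigram:    dict analysis -> probability (hypothetical toy values).
    bigram:     dict (previous, current) -> probability (hypothetical toy values).
    floor:      probability assumed for unseen events.
    """
    def logp(table, key):
        return math.log(table.get(key, floor))

    # best[i][a] = (log-score of the best path ending in analysis a, backpointer)
    best = [{a: (logp(unigram, a), None) for a in candidates[0]}]
    for i in range(1, len(candidates)):
        column = {}
        for cur in candidates[i]:
            prev, score = max(
                ((p, s + logp(bigram, (p, cur))) for p, (s, _) in best[i - 1].items()),
                key=lambda item: item[1],
            )
            column[cur] = (score, prev)
        best.append(column)

    # Trace the best path back from the last word to the first.
    last = max(best[-1], key=lambda a: best[-1][a][0])
    path = [last]
    for i in range(len(candidates) - 1, 0, -1):
        last = best[i][last][1]
        path.append(last)
    return path[::-1]


# Two undiacritized words, each with two candidate diacritizations.
candidates = [["kataba", "kutub"], ["Alwaladu", "Alwalada"]]
unigram = {"kataba": 0.5, "kutub": 0.5}
bigram = {("kataba", "Alwaladu"): 0.6, ("kataba", "Alwalada"): 0.1,
          ("kutub", "Alwaladu"): 0.05, ("kutub", "Alwalada"): 0.05}
chosen = viterbi_disambiguate(candidates, unigram, bigram)
print(chosen)  # ['kataba', 'Alwaladu']
```

In this toy setup the context decides the outcome: both readings of the first word are equally likely in isolation, and the bigram scores select the sequence as a whole.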

 

Try Online...

 

For more detailed info, download...

 

Essay on the Anatomy of RDI's Arabic Diacritizer

(Arabic PDF, 457 KB)

 

Presentation on the Anatomy of RDI's Arabic Diacritizer

(WinZipped Arabic material, 3,940 KB)

 

Presentation on the Anatomy of RDI's Arabic Diacritizer

(English PPS, 1,509 KB)

 

Paper on Arabic Phonetic Grammar for Arabic Diacritization

(English PDF, 140 KB)

 

Paper on Arabic PoS Tagging for Arabic Diacritization

(English PDF, 234 KB)

 

Arabic Diacritization via a Hybrid of Factorizing and Un-factorizing Statistical Disambiguators

(English PDF, 1,421 KB)


ArabMorpho©

RDI's Arabic Morphological Analyzer

 

This core engine of RDI's NLP infrastructure is the basis of RDI's Arabic morphological analysis, Arabic PoS tagging, and Arabic Lexical Semantic Analysis. ArabMorpho© is a morpheme-based lexical analyzer/synthesizer, which distinguishes it from its vocabulary-based rivals and boosts its flexibility and its coverage beyond 99.8%.

 

After the morphological rules are exhausted, deep-horizon dynamic statistical analysis is deployed to achieve a disambiguation word accuracy rate exceeding 96%.

Try Online...

 

For more detailed info, download...

 

Flash movie on RDI's Arabic Morphological Analyzer

(Arabic Narrated; EXE, 2,400 KB)

 

Master Technical Documentation

(English PDF, 1,992 KB)


ArabTagger©

RDI's Arabic Part-of-Speech Tagger

 

Arabic PoS tags are essential input features for many fundamental Arabic NLP processes, e.g. syntax analysis and diacritization.

 

The underlying compact Arabic PoS tag set of RDI's ArabTagger© was designed from the outset to comply with Arabic syntax and morphology, which is a major distinguishing feature of this engine over its rivals.

Try Online...

 

For more detailed info, download...

 

Paper on Arabic PoS Tagging

(English PDF, 234 KB)


Arabic Lexical Semantic Analysis

RDI's Arabic Lexical Semantic Analyzer performs four basic lexical semantic functions:

  • Forward Mapping: retrieve the Arabic words that fall under a given semantic field, where the field belongs to a predefined closed set of global semantic fields.

  • Backward Mapping: map any given Arabic word to one or more of these predefined semantic fields.

  • Retrieve the semantic relation(s) between a given pair of semantic fields.

  • Infer the semantic fields that are related to a given semantic field by a given semantic relation.

These basic functions can be combined to infer the semantic relation(s) between a given pair of Arabic words, and/or to retrieve the words related to a given Arabic word by a given semantic relation. These operations are vital for many applications, e.g. IR, text mining, and MT.
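The four basic functions, and the two word-level operations derived from them, can be pictured with a small in-memory sketch. It is an illustration only: the class, its method names, the semantic fields, the relation label, and the transliterated words are all hypothetical and do not reflect RDI's actual data model or API.

```python
from collections import defaultdict


class ToySemanticLexicon:
    """Toy store of words, semantic fields, and field-to-field relations."""

    def __init__(self):
        self.field_to_words = defaultdict(set)
        self.word_to_fields = defaultdict(set)
        self.field_relations = set()  # (source_field, relation, target_field)

    def add_word(self, word, field):
        self.field_to_words[field].add(word)
        self.word_to_fields[word].add(field)

    def add_relation(self, source_field, relation, target_field):
        self.field_relations.add((source_field, relation, target_field))

    def forward(self, field):
        """Forward mapping: semantic field -> words."""
        return set(self.field_to_words.get(field, ()))

    def backward(self, word):
        """Backward mapping: word -> semantic fields."""
        return set(self.word_to_fields.get(word, ()))

    def relations_between(self, source_field, target_field):
        return {r for s, r, t in self.field_relations
                if s == source_field and t == target_field}

    def related_fields(self, field, relation):
        return {t for s, r, t in self.field_relations
                if s == field and r == relation}

    def word_relations(self, word_a, word_b):
        """Derived: relations between two words, via their fields."""
        return {r for fa in self.backward(word_a) for fb in self.backward(word_b)
                for r in self.relations_between(fa, fb)}

    def related_words(self, word, relation):
        """Derived: words reachable from a word through a given relation."""
        return {w for f in self.backward(word)
                for t in self.related_fields(f, relation)
                for w in self.forward(t)}


lexicon = ToySemanticLexicon()
lexicon.add_word("qalam", "writing_tools")   # "pen"
lexicon.add_word("kitAb", "documents")       # "book"
lexicon.add_relation("writing_tools", "produces", "documents")

print(lexicon.word_relations("qalam", "kitAb"))    # {'produces'}
print(lexicon.related_words("qalam", "produces"))  # {'kitAb'}
```

Storing relations between fields rather than between words keeps the relation table small, and the word-level answers are then composed from backward mapping, a field-level lookup, and forward mapping.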

 

With tens of semantic relations defined, and with Arabic NLP tools handling the highly derivational and inflectional nature of Arabic, this system is on track to become the best Arabic Lexical Semantic Analyzer in terms of coverage.

 

For more info ...

 

LREC 2008 paper on RDI's Arabic Lexical Semantics Language Resource

(PDF, 602 KB)

 

Architecture of the Arabic lexical semantic analyzer

(PDF, 665 KB)

 

Building the Forward Arabic Lexical Semantic DB

(PDF, Arabic Content, 255 KB)


Swift©

RDI's Arabic Text Search Engine

 

This is RDI's derivation-based Arabic text search engine, built on RDI's Arabic morphological analyzer, ArabMorpho©. Searches can be run at the root, pattern, or word level, with single or multiple search queries and various neighborhoods.
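The idea of searching at the root, pattern, or word level can be illustrated with a toy inverted index. This is a minimal sketch under stated assumptions: the tokens are analyzed by hand rather than by ArabMorpho©, a "neighborhood" is assumed to mean a maximum distance in tokens, and the class and method names are hypothetical rather than part of the Swift© SDK.

```python
from collections import defaultdict


class ToyDerivativeIndex:
    """Toy inverted index keyed by word, root, and pattern."""

    LEVELS = ("word", "root", "pattern")

    def __init__(self):
        # postings[level][key] -> list of (doc_id, token_position)
        self.postings = {level: defaultdict(list) for level in self.LEVELS}

    def add_document(self, doc_id, analyzed_tokens):
        """analyzed_tokens: list of dicts with 'word', 'root', 'pattern' keys."""
        for position, token in enumerate(analyzed_tokens):
            for level in self.LEVELS:
                self.postings[level][token[level]].append((doc_id, position))

    def search(self, level, key):
        """Return all (doc_id, position) hits for a key at the given level."""
        return list(self.postings[level].get(key, []))

    def search_near(self, first, second, window):
        """Doc ids where a hit for `first` lies within `window` tokens of a hit
        for `second`; each query is a (level, key) pair."""
        hits_a = self.search(*first)
        hits_b = self.search(*second)
        return sorted({doc_a for doc_a, pos_a in hits_a for doc_b, pos_b in hits_b
                       if doc_a == doc_b and 0 < abs(pos_a - pos_b) <= window})


# Hand-written analyses in Latin transliteration (hypothetical sample data).
index = ToyDerivativeIndex()
index.add_document(1, [
    {"word": "kAtib", "root": "ktb", "pattern": "fAEil"},
    {"word": "maktuwb", "root": "ktb", "pattern": "mafEuwl"},
])
index.add_document(2, [
    {"word": "EAlim", "root": "Elm", "pattern": "fAEil"},
])

print(index.search("root", "ktb"))       # [(1, 0), (1, 1)]
print(index.search("pattern", "fAEil"))  # [(1, 0), (2, 0)]
print(index.search_near(("root", "ktb"), ("pattern", "mafEuwl"), window=1))  # [1]
```

A root-level query matches every derived form of that root, while a pattern-level query cuts across roots, which is what sets this kind of search apart from plain word matching.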

The Swift© Indexing Server can index up to 4G words, and the Swift© Searching Server can handle multi-threaded search queries. SDKs are available for the web, MS-Windows, Linux, and other OSs.

Try Online...

 

For more detailed info, download...

 

Flash movie on RDI's Arabic Text Search Engine

(Arabic Narrated; EXE, 5,636 KB)

 

White paper on Swift©

(English/Arabic DOC, 112 KB)


ArabDictions©

RDI's Arabic Lexical Dictionaries

 

Given a text that has been morphologically analyzed using ArabMorpho©, the root, pattern, prefix, and suffix of any word in the text can each be automatically bound to its corresponding dictionary entry.
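The binding step can be sketched as a set of per-part lookups. The example is only an illustration: it assumes one separate dictionary per morphological part, the analysis is written by hand rather than produced by ArabMorpho©, and the function name and the short glosses are hypothetical placeholders for real dictionary entries.

```python
def bind_entries(analysis, dictionaries):
    """Map each morphological part of a word to its dictionary entry, if any.

    analysis:     dict with optional 'prefix', 'root', 'pattern', 'suffix' keys.
    dictionaries: dict part -> {value -> entry} (hypothetical toy content).
    """
    bound = {}
    for part in ("prefix", "root", "pattern", "suffix"):
        value = analysis.get(part)
        if value is None:
            continue  # this word has no such part
        entry = dictionaries.get(part, {}).get(value)
        if entry is not None:
            bound[part] = entry
    return bound


# Toy dictionaries with placeholder glosses, in Latin transliteration.
dictionaries = {
    "prefix": {"Al": "definite article"},
    "root": {"ktb": "notion of writing"},
    "pattern": {"fAEil": "active participle (doer of the action)"},
    "suffix": {},
}

# Hand-written analysis of the word "AlkAtib" ("the writer").
analysis = {"prefix": "Al", "root": "ktb", "pattern": "fAEil", "suffix": None}
entries = bind_entries(analysis, dictionaries)
```

Here `entries` holds one entry each for the prefix, root, and pattern, and nothing for the absent suffix.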

 

ArabDictions©'s rich Arabic dictionary entries make it easier for readers, especially junior ones, to understand Arabic text at all levels.


Try Online...

 

For more detailed info, download...

 

Flash movie on RDI's Arabic Dictionaries

(Arabic Narrated; EXE, 2,416 KB)