Clever Page©

The community of Arabic-enabled application integrators and vendors has long been seeking a reliable, high-performance Arabic omni font-written OCR technology.

Like other OCR systems, such software takes scanned Arabic paper documents as its input and automatically produces the corresponding digital files, as if a typist had typed those paper documents on a computer.

Document management systems (DMS), library digitization, information retrieval (IR), uni/multi-modal text entry, and reading aids for the blind are just some examples of applications that may benefit from a reliable OCR. Moreover, a robust Arabic OCR software may also serve other languages that use the Arabic script, such as Persian and Urdu.

It is remarkable that OCR systems in general have been under development for decades, that dozens of research Arabic OCR pilots have been produced by academia, and that a handful of Arabic OCR products are even available on the market. However, a reliable Arabic OCR software that works on real-life (multi-font, multi-size, possibly noisy, ...) documents at a practically acceptable average word error rate (WER) within 3% is not yet available on the market!
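
To make the 3% figure concrete, here is a minimal sketch of how a word error rate is commonly computed: the word-level edit distance between the reference text and the OCR output, divided by the number of reference words. This page does not state how RDI measures WER, so the definition, the function name `word_error_rate`, and the whitespace tokenization are all assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumed, commonly used definition; not taken from RDI's evaluation setup.
    """
    ref = reference.split()   # assumption: words are separated by whitespace
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from the output
                curr[j - 1] + 1,     # extra word in the output
                prev[j - 1] + cost,  # match or substituted word
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical 100-word page on which the OCR output gets 3 words wrong.
reference_words = [f"w{i}" for i in range(100)]
ocr_words = list(reference_words)
for k in (10, 50, 90):
    ocr_words[k] = "wrong"

wer = word_error_rate(" ".join(reference_words), " ".join(ocr_words))
print(f"WER = {wer:.2%}")
```

Under this definition, a WER within 3% means that on average no more than about 3 words in every 100 reference words are substituted, dropped, or spuriously inserted.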

The early versions of CleverPage© show excellent results when tried on numerous documents containing multiple fonts and sizes. In fact, these results are the best reported in the published literature to date for Arabic omni font-written OCR.

The long history of RDI in this field, especially that of Prof. Mohsen A. A. Rashwan and Dr. Mohamed Attia, includes numerous MSc and PhD theses, papers published in international conferences and journals, implemented pilots, and an international patent. Together these form a strong basis for the expected success of this new core technology from RDI.

For more detailed information:

Arabic OCR System Analogous to HMM-Based ASR Systems, Dec. 2007
(PDF, 429 KB)

A Large Scale HMM-Based Omni Font-Written OCR System for Cursive Scripts: PhD thesis by Mohamed S. M. El-Mahallaway, Cairo University, April 2008
(PDF, 2,652 KB)

Our system; results & conclusions
(PPS, 5,135 KB)

Competitor analysis of commercial Arabic OCRs.
(Arabic PDF, 728 KB)

Presentation on RDI's line and word decomposition algorithm.
(PPS, 1,888 KB)

Research history of RDI on Arabic OCR.
(PDF, 116 KB)

RDI© - Research and Development International.
Since 1993 - All rights reserved.