|
CleverPage©
The community of Arabic-enabled
application integrators and vendors has long
been seeking a reliable,
high-performance Arabic omni font-written OCR
software technology.
Like other OCR systems,
such software takes scanned Arabic paper
documents as its input and automatically
produces the corresponding digital files,
as if a typist had typed those paper documents
into a computer.
Document
management systems (DMS), library
digitization, information retrieval (IR), uni/multi-modal
text entry, and reading aids for the blind
are just some examples of applications that
may benefit from a reliable OCR. Moreover, a
robust Arabic OCR software may also serve other
languages that use the Arabic script, such as
Persian and Urdu.
It is remarkable
that OCR systems in general have been under development
for decades, tens of research Arabic OCR
pilots have been produced by academia, and a
handful of Arabic OCR products are even available
in the market. However, a reliable Arabic OCR
software that works on real-life (multi-font,
multi-size, possibly noisy, …) documents at a
practically acceptable average word error rate (WER)
within 3% is
not yet available in the market!
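For reference, WER is commonly computed as the word-level edit distance
(substitutions, deletions, and insertions) between the reference text and
the OCR output, divided by the number of reference words. The following is
a minimal Python sketch under that assumption; the function name
word_error_rate is hypothetical, and this is not necessarily the exact
scoring procedure used to evaluate CleverPage©.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are separated by whitespace; no normalization
    (diacritics, punctuation) is applied before comparison.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in the OCR output
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)
```

Under this definition, a WER within 3% means at most about 3 word errors
per 100 reference words on average.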
The early
versions of CleverPage©
are showing excellent results when tried on
numerous documents containing multiple fonts and
sizes. In fact, these results are the best
reported in the published literature to
date for Arabic omni font-written OCR.
The long history
of RDI, especially Prof. Mohsen A. A. Rashwan and Dr.
Mohamed Attia, in this field, including numerous
MSc and PhD theses, published papers in
international conferences and journals,
implemented pilots, and an international patent,
forms a strong basis for the
prospective success of this new core technology
from RDI.
For more detailed info...
Arabic OCR System Analogous
to HMM-Based ASR Systems, Dec. 2007
(PDF, 429 KB)
A Large Scale HMM-Based Omni Font-Written OCR System for
Cursive Scripts: PhD thesis by Mohamed S. M. El-Mahallaway,
Cairo University, April 2008
(PDF, 2,652 KB)
Our system; results & conclusions
(PPS, 5,135 KB)
Competitor analysis of commercial Arabic OCRs.
(Arabic PDF, 728 KB)
Presentation on RDI's line and word decomposition algorithm.
(PPS, 1,888 KB)
Research history of RDI on Arabic OCR.
(PDF, 116 KB)
|