Arabic handwritten document preprocessing and recognition
Title | Arabic handwritten document preprocessing and recognition |
Publication Type | Conference Paper |
Year of Publication | 2015 |
Authors | Chammas, E., Mokbel, C., Likforman-Sulem, L. |
Conference Name | 2015 13th International Conference on Document Analysis and Recognition (ICDAR) |
Keywords | Arabic Handwriting Recognition, Arabic handwritten document preprocessing, Arabic handwritten document recognition, deskewing, document detection, document image processing, guideline detection approach, Guideline removal, guideline removal preprocessing, handwritten character recognition, Handwritten Document preprocessing, Hidden Markov models, image denoising, image recognition, image restoration, image segmentation, k-means, keystroke restoration, line fragment removal, noise effect reduction, noise removal, OpenHaRT database, Optical imaging, Optical reflection, pubcrawl170115, text detection, Text recognition, text-line level preprocessing, Textline image Preprocessing, Writing |
Abstract | Arabic handwritten documents present specific challenges due to the cursive nature of the writing and the presence of diacritical marks. Moreover, one of the largest labeled databases of Arabic handwritten documents, the OpenHaRT-NIST database, includes a specific kind of noise, namely guidelines, that has to be addressed. We propose several approaches to process these documents. First, a guideline detection approach based on K-means has been developed to detect the documents that include guidelines. We then propose a series of preprocessing steps at the text-line level to reduce the effects of noise. For text lines that include guidelines, a guideline removal preprocessing step is described and existing keystroke restoration approaches are assessed. In addition, we propose a preprocessing step that combines noise removal and deskewing by removing line fragments from neighboring text lines while searching for the principal orientation of the text line. We provide recognition results showing the significant improvement brought by the proposed processing steps. |
URL | https://ieeexplore.ieee.org/document/7333802 |
DOI | 10.1109/ICDAR.2015.7333802 |
Citation Key | chammas_arabic_2015 |
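The abstract mentions a K-means-based step that separates documents with guidelines from those without. This record does not state which features the paper clusters on, so the following is only a minimal sketch under stated assumptions: pages are binary images given as lists of rows (1 = ink), the feature `guideline_score` (the largest row-wise ink fraction) is a hypothetical choice, and `two_means_1d` and `make_page` are illustrative helpers, not the authors' implementation.

```python
def row_ink_fraction(image):
    """Fraction of ink pixels (value 1) in each row of a binary image."""
    return [sum(row) / len(row) for row in image]


def guideline_score(image):
    """Hypothetical feature (an assumption, not taken from the paper):
    the largest row-wise ink fraction. A ruled guideline spans most of
    the page width, so it pushes this value toward 1."""
    return max(row_ink_fraction(image))


def two_means_1d(values, iterations=20):
    """Plain 1-D K-means with k=2, initialised at the min and max values.
    Returns (labels, centers); label 1 marks the cluster with the higher center."""
    lo, hi = min(values), max(values)
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value goes to the nearer center.
        labels = [1 if abs(v - hi) < abs(v - lo) else 0 for v in values]
        low_group = [v for v, lab in zip(values, labels) if lab == 0]
        high_group = [v for v, lab in zip(values, labels) if lab == 1]
        # Update step: recompute each center as the mean of its group.
        new_lo = sum(low_group) / len(low_group) if low_group else lo
        new_hi = sum(high_group) / len(high_group) if high_group else hi
        if (new_lo, new_hi) == (lo, hi):
            break
        lo, hi = new_lo, new_hi
    return labels, (lo, hi)


def make_page(width, height, guideline_rows=()):
    """Toy page: sparse 'handwriting' plus optional full-width guidelines."""
    page = [[0] * width for _ in range(height)]
    for r in range(0, height, 3):
        for c in range(0, width, 4):
            page[r][c] = 1
    for r in guideline_rows:
        page[r] = [1] * width
    return page


pages = [
    make_page(40, 12),
    make_page(40, 12, guideline_rows=(5, 10)),
    make_page(40, 12),
    make_page(40, 12, guideline_rows=(2,)),
]
scores = [guideline_score(p) for p in pages]
labels, centers = two_means_1d(scores)
```

On this toy data the second and fourth pages fall into the higher-score cluster. Real scans would likely need a more robust feature, for example one that tolerates skewed or broken guidelines, which is beyond what this sketch attempts.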