Baseline estimation for Arabic handwritten words

M. Pechwitz, V. Margner
Technical University of Braunschweig, Brunswick, Germany

Abstract

Baseline information has been used for diverse purposes in handwriting research. The baseline provides a first orientation within a word and is often a precondition for subsequent algorithms, including preprocessing, segmentation, and feature extraction for recognition systems. Approaches based on the horizontal projection histogram are used for printed Arabic text, but they are ill-suited to handwritten Arabic words. In this paper we present a method based entirely on processing a polygonally approximated skeleton. The central algorithm finds features in the skeleton and performs linear regression analysis. The method performs very well as long as the model assumption of a single straight line holds. We tested it on 26459 isolated Tunisian town names written by 411 writers (IFN/ENIT-database).
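
The contrast drawn in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the function names, the binary-image representation, and the idea of passing in a ready-made list of baseline-relevant points are all assumptions made for illustration. The first function picks the peak row of a horizontal projection histogram; the second fits a straight line to a set of (x, y) points by ordinary least squares.

```python
def projection_baseline(image):
    """Return the row index holding the most ink pixels.

    `image` is assumed to be a list of rows of 0/1 values (1 = ink).
    This is the horizontal projection peak, which is always a
    horizontal line and cannot follow a slanted word.
    """
    row_sums = [sum(row) for row in image]
    return max(range(len(row_sums)), key=row_sums.__getitem__)


def fit_baseline(points):
    """Fit y = slope * x + intercept to (x, y) points by least squares.

    `points` stands in for baseline-relevant skeleton features; how such
    points are selected is not described in the abstract and is left out.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("points share a single x coordinate")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

A fitted line with a nonzero slope can follow a word written at an angle, which a single histogram peak cannot. The sketch shares the limitation named in the abstract: it assumes the baseline is one straight line.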

Keywords

#Writing #Feature extraction #Handwriting recognition #Image recognition #Skeleton #Character recognition #Communications technology #Histograms #Linear regression #Algorithm design and analysis
