Script and nature differentiation for Arabic and Latin text images
Abstract
A method for differentiating Arabic and Latin text blocks, in both printed and handwritten form, is proposed. The method is based on a morphological analysis of each script at the text-block level and a geometrical analysis at the line and connected-component levels. In this paper, we present a brief survey of existing methods for script differentiation, as well as the general characteristics of Arabic and Latin scripts. We then describe our method for differentiating these two scripts. Finally, we report experimental results on two different data sets: the first consists of 400 text blocks and the second of 335 text blocks.
Keywords
#Text analysis #Handwriting recognition #Laboratories #Machine intelligence #Optical character recognition software #Natural languages #Optical devices #Optical sensors #Conferences #Feature extraction