On-line handwriting recognition using character bigram match vectors

A. El-Nasan1, M. Perrone2
1Rensselaer Polytechnic Institute, Troy, NY, USA
2IBM Thomas J. Watson Research Center, Yorktown Heights, NY, USA

Abstract

This paper describes an adaptive, partial-word-level, writer-dependent handwriting recognition system that utilizes the character n-gram statistics of the English language. The system exploits the linguistic property that very few pairs of English words share exactly the same set of character bigrams. This property is used to bring linguistic context to the recognition stage. Recognition is based on estimating the probability of bigram co-occurrences between words. Preliminary experiments using naive features and limited training sets show that the system can recognize over 60% of words it has never seen before in handwritten form. The system has only a few trainable parameters, and incremental training is computationally inexpensive.
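
The linguistic property the abstract relies on can be illustrated with a small sketch. The code below is not the authors' recognizer: it only computes the set of character bigrams for each word in a toy lexicon and reports which words are indistinguishable by that set alone. The function names (`bigram_set`, `shared_bigrams`, `ambiguous_groups`) and the five-word lexicon are assumptions made for illustration.

```python
from collections import defaultdict


def bigram_set(word):
    """Return the set of adjacent character pairs in a word (case-folded)."""
    w = word.lower()
    return frozenset(w[i:i + 2] for i in range(len(w) - 1))


def shared_bigrams(word_a, word_b):
    """Return the bigrams that occur in both words."""
    return bigram_set(word_a) & bigram_set(word_b)


def ambiguous_groups(lexicon):
    """Group words that have exactly the same bigram set.

    Only groups with more than one member are returned, i.e. the words
    that a bigram set alone cannot tell apart.
    """
    groups = defaultdict(list)
    for word in lexicon:
        groups[bigram_set(word)].append(word)
    return [words for words in groups.values() if len(words) > 1]


# Hypothetical toy lexicon; a real experiment would use a full word list.
lexicon = ["the", "then", "hen", "papa", "pap"]

print(sorted(bigram_set("then")))             # bigrams of a single word
print(sorted(shared_bigrams("then", "hen")))  # bigrams two words have in common
print(ambiguous_groups(lexicon))              # words sharing an identical bigram set
```

In this toy lexicon only "papa" and "pap" collide, because repeated bigrams collapse in a set. On a full English word list such collisions are rare, according to the abstract, which is what lets bigram evidence identify a word the system has not seen in handwritten form.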

Keywords

#Handwriting recognition #Character recognition #Statistics #Hidden Markov models #Probability #System testing #Natural languages #Text recognition #Context modeling #Optical character recognition software
