A real-time lipreading LSI for word recognition

K. Nakamura¹, N. Murakami², K. Takagi², N. Takagi²
¹ Center for Information Media Studies, University of Nagoya, Nagoya, Japan
² Department of Information Engineering, University of Nagoya, Nagoya, Japan

Abstract

In this paper, we present a real-time lip-reading LSI that recognizes spoken words from lip movement. The LSI recognizes up to 8 words based on the hidden Markov model (HMM). It accepts 256 × 256 8-bit gray-scale images from a camera and outputs a 3-bit word symbol code for each 43 images (corresponding to 1.53 s). We present a lip-reading algorithm optimized for hardware implementation. We designed the lip-reading LSI and fabricated a 4.9 mm × 4.9 mm chip in a 0.35 µm process via VDEC Rohm. The LSI performs real-time recognition when operating at 40 MHz.

Keywords

#Large scale integration #Hidden Markov models #Image edge detection #Gray-scale #Cameras #Speech recognition #Humans #Vector quantization #Hardware design languages #Image recognition
