ASR in portable wireless devices

O. Viikki¹
¹Speech and Audio Systems Laboratory, Nokia Research Center, Tampere, Finland

Abstract

This paper discusses the applicability and role of automatic speech recognition (ASR) in portable wireless devices. Owing to the author's background, the viewpoints are somewhat biased towards mobile telephones, but many of the aspects are nevertheless common to other portable devices as well. Although speaker-dependent technology still dominates, there are signs today of a trend towards speaker-independent systems in wireless devices too. As these modern communication devices are usually intended for mass markets, the paper reviews the ASR areas that are relevant to speech recognition on low-cost embedded systems. In particular, multilingual ASR, low-complexity ASR algorithms and their implementation, and acoustic model adaptation techniques play a key role in enabling cost-effective realization of ASR systems. Low complexity and advanced noise robustness are sometimes conflicting goals. The paper also briefly reviews some of the most important noise-robust ASR techniques that are well suited to embedded systems.

Keywords

#Automatic speech recognition #Costs #Embedded system #Acoustic noise #Noise robustness #Telephony #Mobile communication #Speech recognition #Acoustic devices #Adaptation model
