An examination of three classes of ASR dialogue systems: PC-based dictation, in-car systems and automated directory assistance

M.J. Hunt
Phonetic Systems UK Limited, Cheltenham, UK

Abstract

Three classes of practical speech recognition dialogue systems are considered, starting with PC-based systems, specifically dictation systems. Although such systems have become very effective, they have not achieved mainstream use. Some reasons for this disappointing outcome are proposed. Speech recognition is now appearing in production cars. It is argued that the two most attractive in-car applications are for navigation systems and for dialing-by-name. The latter may be more suited to equipment that can be detached from the car and connected to a PC. After considering telephone applications in general, the importance of automated DA (directory assistance - also called directory enquiries or DQ in some countries) is established and its particular challenges are discussed. Among these are the size and dynamic nature of the databases accessed, and the variations produced by callers in naming a commercial/administrative entity whose number they are seeking. The advantages of a bottom-up phonetic speech recognition technique for automated DA are described. It is concluded that the combination of this technique and automatic methods for handling name variation makes automated DA, including access to business listings, a practical proposition.

Keywords

#Automatic speech recognition #Speech recognition #Marketing and sales #Navigation #Telephony #Databases #Business #Application software #Computerized monitoring #Automatic control
