Language models beyond word strings

E. Nöth¹, A. Batliner¹, H. Niemann¹, G. Stemmer¹, F. Gallwitz²,¹, J. Spilker³,¹
¹Universität Erlangen-Nürnberg, Lehrstuhl für Mustererkennung (Informatik 5), Erlangen, Germany
²Lehrstuhl für Künstliche Intelligenz (Informatik 8), Datev, Germany
³Sympalog Speech Technologies, Germany

Abstract

This paper shows how n-gram language models can provide information beyond the pure word chain in automatic speech understanding systems. Such information matters for conversational dialogue systems, which have to recognize and interpret spontaneous speech. We show how n-grams can: (1) help to classify prosodic events such as boundaries and accents; (2) be extended to provide boundary information directly in the speech recognition phase; (3) help to process speech repairs; and (4) detect and semantically classify out-of-vocabulary words. The approaches can work on the best word chain or on a word hypotheses graph. Examples and experimental results are drawn from our own research within the EVAR information retrieval system and the VERBMOBIL speech-to-speech translation system.
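As a rough illustration of point (1), the sketch below treats a phrase boundary as an ordinary token in the training text and lets a bigram model compare the two hypotheses "boundary here" and "no boundary here" between adjacent words. It is a toy under stated assumptions, not the models used in EVAR or VERBMOBIL: the `<B>` symbol, the `BigramLM` class, the `boundary_score` function, the add-one smoothing, and the four training utterances are all hypothetical and chosen only to keep the example self-contained.

```python
from collections import Counter

BOUNDARY = "<B>"            # hypothetical boundary token inserted into the word chain
START, END = "<s>", "</s>"  # utterance padding symbols


class BigramLM:
    """Toy bigram model with add-one smoothing (an assumption of this sketch)."""

    def __init__(self, sentences):
        self.context_counts = Counter()  # how often a token occurs as left context
        self.bigram_counts = Counter()   # how often (left, right) occurs
        self.vocab = set()
        for tokens in sentences:
            padded = [START] + list(tokens) + [END]
            self.vocab.update(padded)
            for prev, cur in zip(padded, padded[1:]):
                self.context_counts[prev] += 1
                self.bigram_counts[(prev, cur)] += 1

    def prob(self, prev, cur):
        """P(cur | prev) with add-one smoothing over the vocabulary."""
        v = len(self.vocab)
        return (self.bigram_counts[(prev, cur)] + 1) / (self.context_counts[prev] + v)


def boundary_score(lm, left, right):
    """Normalized score for a boundary between `left` and `right`.

    Compares the word chain 'left <B> right' with the chain 'left right'.
    """
    with_boundary = lm.prob(left, BOUNDARY) * lm.prob(BOUNDARY, right)
    without_boundary = lm.prob(left, right)
    return with_boundary / (with_boundary + without_boundary)


# Invented training utterances with hand-placed boundary tokens.
train = [
    "yes <B> i would like a train to hamburg".split(),
    "okay <B> that is fine".split(),
    "yes <B> that is fine".split(),
    "i would like a train to munich <B> tomorrow".split(),
]
lm = BigramLM(train)

p_after_yes = boundary_score(lm, "yes", "that")  # position where boundaries were seen
p_inside = boundary_score(lm, "a", "train")      # position inside a phrase

print(f"boundary after 'yes': {p_after_yes:.3f}")
print(f"boundary inside 'a train': {p_inside:.3f}")
```

With this toy data the score after "yes" comes out higher than the score inside "a train", which is the behaviour one would hope for. A realistic system would use higher-order n-grams, better smoothing, and would combine the language model score with acoustic-prosodic evidence.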

Keywords

#Speech recognition #Speech processing #Speech analysis #Databases #Natural languages #Event detection #Phase detection #Stochastic systems #Automatic speech recognition #Virtual manufacturing
