MLLR adaptation techniques for pronunciation modeling

U. Venkataramani¹, W. Byrne¹
¹Center for Language and Speech Processing, The Johns Hopkins University, Baltimore, MD, U.S.A.

Abstract

Multiple regression class MLLR (maximum likelihood linear regression) transforms are investigated for use with pronunciation models that predict variation in the observed pronunciations given the phonetic context. Regression classes can be constructed so that MLLR transforms can be estimated and used to model specific acoustic changes associated with pronunciation variation. The effectiveness of this modeling approach is evaluated on the phonetically transcribed portion of the SWITCHBOARD conversational speech corpus.
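As a rough illustration of the idea, the sketch below applies a separate affine MLLR mean transform (adapted mean = A·mean + b) to each regression class. It assumes mean-only adaptation, and the model names, class labels, and transform values are hypothetical placeholders rather than anything taken from the paper; in practice the transforms would be estimated from data by maximum likelihood, which is not shown here.

```python
from typing import Dict, List, Tuple

Vector = List[float]
Matrix = List[List[float]]
Transform = Tuple[Matrix, Vector]  # (A, b) for one regression class


def apply_mllr_mean(mean: Vector, A: Matrix, b: Vector) -> Vector:
    """Return the adapted mean A @ mean + b."""
    return [
        sum(a_ij * m_j for a_ij, m_j in zip(row, mean)) + b_i
        for row, b_i in zip(A, b)
    ]


def adapt_means(
    means: Dict[str, Vector],
    class_of: Dict[str, str],
    transforms: Dict[str, Transform],
) -> Dict[str, Vector]:
    """Adapt each Gaussian mean with the transform of its regression class."""
    adapted = {}
    for name, mean in means.items():
        A, b = transforms[class_of[name]]
        adapted[name] = apply_mllr_mean(mean, A, b)
    return adapted


# Hypothetical toy setup: two models share the same mean, but one belongs to a
# regression class reserved for a pronunciation variant.
means = {"t_canonical": [1.0, 2.0], "t_flapped": [1.0, 2.0]}
class_of = {"t_canonical": "baseform", "t_flapped": "surface_variant"}
transforms = {
    "baseform": ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),          # identity
    "surface_variant": ([[0.5, 0.0], [0.0, 2.0]], [1.0, -1.0]),  # made-up values
}

adapted = adapt_means(means, class_of, transforms)
print(adapted)
```

Grouping models into classes by pronunciation variant, rather than by acoustic similarity alone, is what lets a single transform capture the acoustic change associated with that variant.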

Keywords

#Maximum likelihood linear regression #Automatic speech recognition #Predictive models #Dictionaries #Natural languages #Speech processing #Context modeling #Speech analysis #Surface treatment #Decision trees
