MLLR adaptation techniques for pronunciation modeling
Abstract
Multiple regression class MLLR (maximum likelihood linear regression) transforms are investigated for use with pronunciation models that predict variation in the observed pronunciations given the phonetic context. Regression classes can be constructed so that MLLR transforms can be estimated and used to model specific acoustic changes associated with pronunciation variation. The effectiveness of this modeling approach is evaluated on the phonetically transcribed portion of the SWITCHBOARD conversational speech corpus.
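As a rough illustration of what "one transform per regression class" means, the sketch below applies the standard MLLR mean update, mu' = A mu + b, using a separate (A, b) pair for each class. It is only a minimal sketch: the class labels, Gaussian names, and all numeric values are hypothetical, the transforms are given by hand rather than estimated by maximum likelihood from adaptation data, and the abstract does not specify how the regression classes were actually built.

```python
# Sketch: applying one MLLR mean transform per regression class.
# All names and numbers here are hypothetical and chosen only for illustration.

def apply_affine(A, b, mu):
    """Return A*mu + b for a square matrix A (list of rows) and vectors b, mu."""
    return [sum(a_ij * m_j for a_ij, m_j in zip(row, mu)) + b_i
            for row, b_i in zip(A, b)]

def adapt_means(means, class_of, transforms):
    """Adapt each Gaussian mean with the transform of its regression class."""
    adapted = {}
    for name, mu in means.items():
        A, b = transforms[class_of[name]]
        adapted[name] = apply_affine(A, b, mu)
    return adapted

# Two hypothetical 2-D Gaussian means for the same phone model.
means = {"t_canonical": [1.0, 2.0], "t_variant": [3.0, 4.0]}

# Assumed assignment of Gaussians to regression classes, e.g. one class for
# canonical realizations and one for a predicted pronunciation variant.
class_of = {"t_canonical": "unchanged", "t_variant": "variant"}

# One (A, b) pair per class; in practice these would be estimated, not set by hand.
transforms = {
    "unchanged": ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),
    "variant": ([[0.5, 0.0], [0.0, 2.0]], [1.0, -1.0]),
}

adapted = adapt_means(means, class_of, transforms)
```

Gaussians in the "unchanged" class keep their means under the identity transform, while those in the "variant" class are shifted and scaled, which is the sense in which a class-specific transform can stand in for an acoustic change tied to a pronunciation variant.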
Keywords
#Maximum likelihood linear regression #Automatic speech recognition #Predictive models #Dictionaries #Natural languages #Speech processing #Context modeling #Speech analysis #Surface treatment #Decision trees