Maximum likelihood modelling of pronunciation variation

Speech Communication - Volume 29 - Pages 177-191 - 1999

Trym Holter¹, Torbjørn Svendsen²

¹Department of Signal Processing and Systems Design, SINTEF Telecom and Informatics, O.S. Bragstads plass 2, 7465 Trondheim, Norway

²Department of Telecommunications, Norwegian University of Science and Technology, Norway

References

Asadi, A., Schwartz, R., Makhoul, J., 1991. Automatic modeling for adding new words to a large vocabulary speech recognition system. In: Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, Toronto, Canada, pp. 305–308

Bacchiani, M., Ostendorf, M., 1998. Joint acoustic design and lexicon generation. In: Proceedings of the ESCA Workshop on Modeling Pronunciation Variation for Automatic Speech Recognition. ESCA, Rolduc, The Netherlands, pp. 7–12

Bahl, L.R., Das, S., deSouza, P.V., Epstein, M., Mercer, R.L., Merialdo, B., Nahamoo, D., Picheny, M.A., Powell, J., 1991. Automatic phonetic baseform determination. In: Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, Toronto, Canada, pp. 173–176

Bahl, 1993, A method for the construction of acoustic Markov models for words, IEEE Trans. Speech Audio Process., 1, 442, 10.1109/89.242490

Cohen, M., 1989. Phonological structures for speech recognition. Ph.D. Thesis, University of California, Berkeley

Gillick, L., Cox, S.J., 1989. Some statistical issues in the comparison of speech recognition algorithms. In: Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, Glasgow, Scotland, pp. 532–535

Haeb-Umbach, R., Beyerlein, P., Thelen, E., 1995. Automatic transcription of unknown words in a speech recognition system. In: Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, Detroit, USA, pp. 840–843

Holter, T., 1997. Maximum likelihood modelling of pronunciation in automatic speech recognition. Ph.D. Thesis, Norwegian University of Science and Technology

Holter, T., Svendsen, T., 1996. A comparison of lexicon-building methods for subword-based speech recognisers. In: Proceedings of the IEEE Region 10 Conference on Digital Signal Processing (TENCON). IEEE, Perth, Australia, pp. 102–106

Holter, T., Svendsen, T., 1997a. Combined optimisation of baseforms and model parameters in speech recognition based on acoustic sub-word units. In: Proceedings of the 1997 IEEE Workshop on Speech Recognition and Understanding. IEEE, Santa Barbara, USA, pp. 199–206

Holter, T., Svendsen, T., 1997b. Incorporating linguistic knowledge and automatic baseform generation in acoustic subword based speech recognition. In: Proceedings of the European Conference on Speech Communication and Technology (EUROSPEECH). Rhodes, Greece, pp. 1159–1162

Klatt, 1987, Review of text-to-speech conversion for English, J. Acoust. Soc. Amer., 82, 737, 10.1121/1.395275

Lee, C.-H., Soong, F.K., Juang, B.-H., 1988. A segment model approach to speech recognition. In: Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, New York, pp. 501–504

Lee, C.-H., Juang, B.-H., Soong, F.K., Rabiner, L.R., 1989. Word recognition using whole word and subword models. In: Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, Glasgow, Scotland, pp. 683–686

Linde, 1980, An algorithm for vector quantizer design, IEEE Trans. Commun., 28, 84, 10.1109/TCOM.1980.1094577

Lu, 1978, A sentence-to-sentence clustering procedure for pattern analysis, IEEE Trans. Syst. Man Cybernet. SMC, 8, 381, 10.1109/TSMC.1978.4309979

Lucassen, J.M., Mercer, R.L., 1984. An information theoretic approach to the automatic determination of phonemic baseforms. In: Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, San Diego, USA, pp. 42.5.1–42.5.4

Mokbel, H., Jouvet, D., 1998. Derivation of the optimal phonetic transcription set for a word from its acoustic realisations. In: Proceedings of the ESCA Workshop on Modeling Pronunciation Variation for Automatic Speech Recognition. ESCA, Rolduc, The Netherlands, pp. 73–78

Nilsson, 1971

NIST Speech Disc 2-4.2, 1992. Resource Management continuous speech database (RM1) – Development test and evaluation test data and scoring software

Paliwal, K.K., 1990. Lexicon-building methods for an acoustic sub-word based speech recognizer. In: Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, Albuquerque, USA, pp. 729–732

Price, P., Fisher, W.M., Bernstein, J., Pallett, D.S., 1988. The DARPA 1000-word Resource Management database for continuous speech recognition. In: Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, New York, pp. 651–654

Ramabhadran, B., Bahl, L.R., deSouza, P.V., Padmanabhan, M., 1998. Acoustics-only based automatic phonetic baseform generation. In: Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, Seattle, USA, pp. 309–312

Sloboda, T., 1995. Dictionary learning: Performance through consistency. In: Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, Detroit, USA, pp. 453–456

Sloboda, T., Waibel, A., 1996. Dictionary learning for spontaneous speech recognition. In: Proceedings of the International Conference on Spoken Language Processing (ICSLP). Philadelphia, USA, pp. 2328–2331

Soong, F.K., Huang, E.-F., 1991. A tree-trellis based fast search for finding the N best sentence hypotheses in continuous speech recognition. In: Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, Toronto, Canada, pp. 705–708

Strik, H., Cucchiarini, C., 1998. Modeling pronunciation variation for ASR: Overview and comparison of methods. In: Proceedings of the ESCA Workshop on Modeling Pronunciation Variation for Automatic Speech Recognition. ESCA, Rolduc, The Netherlands, pp. 137–144

Svendsen, T., Paliwal, K.K., Harborg, E., Husøy, P.O., 1989. An improved sub-word based speech recognizer. In: Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, Glasgow, Scotland, pp. 108–111

Svendsen, T., Soong, F.K., Purnhagen, H., 1995. Optimizing baseforms for HMM-based speech recognition. In: Proceedings of the European Conference on Speech Communication and Technology (EUROSPEECH). Madrid, Spain, pp. 783–786

Wilpon, J.G., Juang, B.-H., Rabiner, L.R., 1987. An investigation on the use of acoustic sub-word units for automatic speech recognition. In: Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, Dallas, USA, pp. 821–824

Young, S.J., Jansen, J., Odell, J., Ollason, D., Woodland, P., 1993. HTK: Hidden Markov Model Toolkit V1.5. Cambridge University Engineering Department Speech Group and Entropic Research Laboratories

Zhao, 1993, A speaker-independent continuous speech recognition system using continuous mixture Gaussian density HMM of phoneme-sized units, IEEE Trans. Speech Audio Process., 1, 345, 10.1109/89.232618
