Speech recognition of broadcast news for the European Portuguese language

H. Meinedo (1), N. Souto (1), J.P. Neto (2)
(1) L2F - Spoken Language Systems Laboratory, INESC ID Lisboa, Lisboa, Portugal
(2) L2F - Spoken Language Systems Laboratory, INESC ID Lisboa/IST, Lisboa, Portugal

Abstract

This paper describes the development of a large-vocabulary continuous speech recognition system for a broadcast news (BN) task in European Portuguese, carried out within the ALERT project. We start by presenting the baseline recogniser, AUDIMUS, which was originally developed with a corpus of read newspaper text. It is a hybrid system that combines phone probabilities generated by several multilayer perceptrons (MLPs) trained on distinct feature sets. The paper details the modifications made to this system: a new language model, a new vocabulary and pronunciation lexicon, and training on the data currently available from the ALERT BN corpus. Trained on this BN corpus, the system achieved a word error rate (WER) of 18.4% when tested on the F0 focus condition (studio, planned, native, clean) and 35.2% when tested on all focus conditions.
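The results above are reported as word error rate. As a rough illustration of how such a figure is usually obtained, the sketch below computes WER as the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name `word_error_rate` and the example sentences are hypothetical; the scoring tool and text normalisation actually used by the authors are not described here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative only: real evaluations normally normalise the text
    (case, punctuation, numbers) before scoring, which is skipped here.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


# Made-up example: one substitution and one deletion over five reference words.
wer = word_error_rate("o governo aprovou a lei", "o governo aprova lei")
print(f"WER = {wer:.1%}")
```

Because insertions count as errors, WER computed this way can exceed 100% when the hypothesis is much longer than the reference.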

Keywords

Speech recognition; Broadcasting; Natural languages; Streaming media; System testing; Databases; Vocabulary; Multimedia systems; TV; Audio recording
