Out-of-vocabulary word modeling using multiple lexical fillers

G. Boulianne¹, P. Dumouchel¹
¹Centre de recherche informatique de Montréal, Montréal, Québec, Canada

Abstract

In large-vocabulary speech recognition, out-of-vocabulary (OOV) words are an important cause of errors. We describe a lexical filler model that can be used in a single-pass recognition system to detect OOV words and reduce the error rate. When word graphs are rescored with better acoustic models, however, word fillers cause a combinatorial explosion. We introduce a new technique, using several thousand lexical fillers, that produces word graphs that can be rescored efficiently. On a French large-vocabulary continuous speech recognition task, lexical fillers achieved an OOV detection rate of 44% and allowed a 23% reduction in errors due to OOV words.

Keywords

#Vocabulary #Speech recognition #Dictionaries #Acoustic signal detection #Explosions #Robustness #Natural languages #Error analysis #Degradation #Character recognition
