The MERL SpokenQuery information retrieval system: a system for retrieving pertinent documents from a spoken query

P. Wolf¹, B. Raj¹
¹Mitsubishi Electric Research Laboratories, Inc., Cambridge, MA, USA

Abstract

This paper describes key concepts behind the design of a spoken-query-based information retrieval system developed at the Mitsubishi Electric Research Labs (MERL). Innovations in the system include the automatic inclusion of documents' signature terms in the recognizer's vocabulary, the use of uncertainty vectors to represent spoken queries, and an indexing method that accommodates uncertainty vectors. The paper describes these techniques and presents experimental results that demonstrate their effectiveness.

Keywords

#Information retrieval #Speech recognition #Vocabulary #Engines #Uncertainty #Technological innovation #Indexing #Keyboards #Personal digital assistants #Cellular phones
