Smoothed language model incorporation for efficient time-synchronous beam search decoding in LVCSR

IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01. - Pages 178-181

D. Willett¹, E. McDermott¹, S. Katagiri¹

¹Speech Open Lab, NTT Corporation, Kyoto, Japan

Abstract

For performing the decoding search in large vocabulary continuous speech recognition (LVCSR) with hidden Markov models (HMM) and statistical language models, the most straightforward and popular approach is the time-synchronous beam search procedure. A drawback of this approach is that the time-asynchrony of the language model weight application during search leads to performance degradations. This is particularly so when performing the search with a tight pruning beam. This study presents a method for smoothing the language model within the recognition network. The optimization goal is the smearing of transition probabilities from HMM state to HMM state in favor of a more time-synchronous language model weight application. In addition, state-based language model look-ahead is proposed and evaluated. Both language model smoothing techniques lead to a remarkable improvement in accuracy-to-run-time ratio, while their combined application yields only limited improvements.
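The idea of spreading a language model score over a search network can be illustrated with a small sketch. The code below is not the authors' state-based method. It is a generic unigram look-ahead over a phoneme-level prefix tree, and the lexicon, the probabilities, and every function name are hypothetical, chosen only for illustration. Each node stores the best log-probability of any word still reachable from it, and each arc carries the change in that value, so the language model score is applied step by step rather than all at once at the word end.

```python
import math

# Hypothetical toy lexicon and unigram probabilities (illustration only).
LEXICON = {
    "speech": ["s", "p", "iy", "ch"],
    "speed": ["s", "p", "iy", "d"],
    "beam": ["b", "iy", "m"],
}
LOG_PROB = {"speech": math.log(0.5), "speed": math.log(0.2), "beam": math.log(0.3)}


def build_tree(lexicon):
    """Build a phoneme prefix tree; a node is a dict with children and a word label."""
    root = {"children": {}, "word": None}
    for word, phones in lexicon.items():
        node = root
        for p in phones:
            node = node["children"].setdefault(p, {"children": {}, "word": None})
        node["word"] = word
    return root


def annotate_lookahead(node, log_prob):
    """Store in node['la'] the best log-probability of any word reachable from node."""
    best = log_prob[node["word"]] if node["word"] is not None else -math.inf
    for child in node["children"].values():
        best = max(best, annotate_lookahead(child, log_prob))
    node["la"] = best
    return best


def smear(node, parent_la=0.0):
    """Store in node['inc'] the incremental score applied when entering node."""
    node["inc"] = node["la"] - parent_la
    for child in node["children"].values():
        smear(child, node["la"])


def path_score(root, phones, log_prob):
    """Sum the increments along a word's path, plus the word-end residual."""
    node = root
    total = root["inc"]
    for p in phones:
        node = node["children"][p]
        total += node["inc"]
    # The residual is non-zero only when a longer word shares this node.
    return total + (log_prob[node["word"]] - node["la"])


root = build_tree(LEXICON)
annotate_lookahead(root, LOG_PROB)
smear(root)

for word, phones in LEXICON.items():
    print(word, path_score(root, phones, LOG_PROB), LOG_PROB[word])
```

The increments telescope, so the total along any word's path equals that word's original log-probability. Only the timing of the score application changes: the penalty for "beam" is paid on its first arc, and the penalty that separates "speed" from "speech" is paid where the two paths diverge.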

Keywords

#Decoding #Hidden Markov models #Acoustic beams #Degradation #Smoothing methods #Viterbi algorithm #Context modeling #Laboratories #Speech recognition #Natural languages

