Automatic selection of transcribed training material

IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01. - Pages 417-420

T.M. Kamm¹, G.G.L. Meyer¹

¹Center for Language and Speech Processing, Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD, USA

Abstract

Conventional wisdom says that incorporating more training data is the surest way to reduce the error rate of a speech recognition system. This, in turn, guarantees that speech recognition systems are expensive to train, because of the high cost of annotating training data. We propose an iterative training algorithm that seeks to improve the error rate of a speech recognizer without incurring additional transcription cost, by selecting a subset of the already available transcribed training data. We apply the proposed algorithm to an alpha-digit recognition problem and reduce the error rate from 10.3% to 9.4% on a particular test set.
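The abstract does not spell out the selection procedure, so the following is only a minimal sketch of one way iterative subset selection could look, not the authors' algorithm. It assumes a greedy backward elimination: drop an item whenever retraining without it lowers the measured error. The names `select_training_subset`, `train`, and `error_rate` are hypothetical, and the toy "recognizer" is just a mean over integers.

```python
from typing import Callable, List, Sequence


def select_training_subset(
    items: Sequence[int],
    train: Callable[[List[int]], float],
    error_rate: Callable[[float], float],
    max_iterations: int = 10,
) -> List[int]:
    """Greedy backward elimination over already-transcribed items.

    Assumption: this is an illustrative stand-in, NOT the procedure
    from the paper. Items are assumed to be unique.
    """
    selected = list(items)
    best_error = error_rate(train(selected))
    for _ in range(max_iterations):
        improved = False
        for item in list(selected):
            candidate = [x for x in selected if x != item]
            if not candidate:
                continue  # never train on an empty set
            err = error_rate(train(candidate))
            if err < best_error:
                # Retraining without this item helped, so drop it.
                selected, best_error, improved = candidate, err, True
        if not improved:
            break  # no single removal lowers the error any further
    return selected


# Toy stand-ins (hypothetical): the "model" is the mean of the data,
# and the "error" is its distance from a target value of 10.
def toy_train(data: List[int]) -> float:
    return sum(data) / len(data)


def toy_error(model: float) -> float:
    return abs(model - 10)


kept = select_training_subset([10, 11, 9, 50], toy_train, toy_error)
print(kept)
```

In this toy run the outlier `50` is the only item whose removal lowers the error, so it is the only one dropped. With a real recognizer, `train` and `error_rate` would be expensive, and the error would have to be measured on data kept separate from the final test set.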

Keywords

#Speech recognition #Error analysis #Iterative algorithms #Training data #Costs #System testing #Natural languages #Speech processing #Data mining #Automatic speech recognition

