Ensemble of deep neural networks using acoustic environment classification for statistical model-based voice activity detection

Computer Speech & Language - Volume 38 - Pages 1-12 - 2016

Inyoung Hwang¹, Hyung-Min Park², Joon-Hyuk Chang¹

¹School of Electronic and Computer Engineering, Hanyang University, Seoul 133-791, Republic of Korea

²School of Electronic Engineering, Sogang University, Seoul 121-742, Republic of Korea

