Ensemble of deep neural networks using acoustic environment classification for statistical model-based voice activity detection
References
Bishop, 1995
Chang, 2001, Speech enhancement: new approaches to soft decision, IEICE Trans. Inf. Syst., E84-D, 1231
Choi, 2012, On using environment classification for statistical model-based speech enhancement, Speech Commun., 54, 477, 10.1016/j.specom.2011.10.009
Ephraim, 1984, Speech enhancement using a minimum mean-square error short-time spectral amplitude estimator, IEEE Trans. Acoust. Speech Signal Process., 32, 1109, 10.1109/TASSP.1984.1164453
Garofolo, 1993, TIMIT acoustic phonetic continuous speech corpus
Hinton, 2006, Reducing the dimensionality of data with neural networks, Science, 313, 504, 10.1126/science.1127647
Hinton, 2006, A fast learning algorithm for deep belief nets, Neural Comput., 18, 1527, 10.1162/neco.2006.18.7.1527
Hinton, 2012, Deep neural networks for acoustic modeling in speech recognition, IEEE Signal Process. Mag., 29, 82, 10.1109/MSP.2012.2205597
Hirsch, 2000, The Aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions
Jo, 2009, Statistical model-based voice activity detection using support vector machine, IET Signal Process., 3, 205, 10.1049/iet-spr.2008.0128
Kang, 2008, Discriminative weight training for a statistical model-based voice activity detection, IEEE Signal Process. Lett., 15, 170, 10.1109/LSP.2007.913595
Lee, 2007, Sparse deep belief net model for visual area V2
Mohamed, 2009, Deep belief networks for phone recognition
Mohamed, 2012, Acoustic modeling using deep belief networks, IEEE Trans. Audio Speech Lang. Process., 20, 14, 10.1109/TASL.2011.2109382
Platt, 2000, Probabilistic outputs for support vector machines and comparison to regularized likelihood methods
Ryant, 2013, Speech activity detection on YouTube using deep neural networks
Sangwan, 2007, Environmentally aware voice activity detector
Sohn, 1999, A statistical model-based voice activity detection, IEEE Signal Process. Lett., 6, 1, 10.1109/97.736233
Varga, 1993, Assessment for automatic speech recognition, II-NOISEX-92: a database and an experiment to study the effect of additive noise on speech recognition systems, Speech Commun., 12, 247, 10.1016/0167-6393(93)90095-3
Xia, 2014, Wiener filtering based speech enhancement with weighted denoising auto-encoder and noise classification, Speech Commun., 60, 13, 10.1016/j.specom.2014.02.001
Yu, 2010, Discriminative training for multiple observation likelihood ratio based voice activity detection, IEEE Signal Process. Lett., 17, 897, 10.1109/LSP.2010.2066561
Zhang, 2013, Denoising deep neural networks based voice activity detection
Zhang, 2013, Deep belief networks based voice activity detection, IEEE Trans. Audio Speech Lang. Process., 21, 3371