Optimization and evaluation of sigmoid function with a priori SNR estimate for real-time speech enhancement

Speech Communication - Volume 55 - Pages 358-376 - 2013
Pei Chee Yong1, Sven Nordholm1, Hai Huyen Dam1
1 Curtin University, Kent Street, Bentley, WA 6102, Australia
