Speech enhancement using a minimum mean-square error short-time spectral modulation magnitude estimator
References
Atlas, 2003, Joint acoustic and modulation frequency, EURASIP J. Appl. Signal Process., 2003, 668, 10.1155/S1110865703305013
Atlas, L., Li, Q., Thompson, J., 2004. Homomorphic modulation spectra. In: Proc. IEEE Internat. Conf. on Acoustics, Speech, and Signal Process. (ICASSP), Vol. 2, Montreal, Quebec, Canada, pp. 761–764.
Berouti, M., Schwartz, R., Makhoul, J., 1979. Enhancement of speech corrupted by acoustic noise. In: Proc. IEEE Internat. Conf. on Acoustics, Speech, and Signal Process. (ICASSP), Vol. 4. Washington, DC, USA, pp. 208–211.
Boll, 1979, Suppression of acoustic noise in speech using spectral subtraction, IEEE Trans. Acoust. Speech Signal Process., ASSP-27, 113, 10.1109/TASSP.1979.1163209
Breithaupt, 2011, Analysis of the decision-directed SNR estimator for speech enhancement with respect to low-SNR and transient conditions, IEEE Trans. Audio Speech Lang. Process., 19, 277, 10.1109/TASL.2010.2047681
Cappe, 1994, Elimination of the musical noise phenomenon with the Ephraim and Malah noise suppressor, IEEE Trans. Speech Audio Process., 2, 345, 10.1109/89.279283
Cohen, 2005, Relaxed statistical model for speech enhancement and a priori SNR estimation, IEEE Trans. Speech Audio Process., 13, 870, 10.1109/TSA.2005.851940
Cohen, 2002, Noise estimation by minima controlled recursive averaging for robust speech enhancement, IEEE Signal Process. Lett., 9, 12, 10.1109/97.988717
Drullman, 1994, Effect of reducing slow temporal modulations on speech reception, J. Acoust. Soc. Amer., 95, 2670, 10.1121/1.409836
Ephraim, 1984, Speech enhancement using a minimum mean-square error short-time spectral amplitude estimator, IEEE Trans. Acoust. Speech Signal Process., ASSP-32, 1109, 10.1109/TASSP.1984.1164453
Ephraim, 1985, Speech enhancement using a minimum mean-square error log-spectral amplitude estimator, IEEE Trans. Acoust. Speech Signal Process., ASSP-33, 443, 10.1109/TASSP.1985.1164550
Falk, T.H., Chan, W.-Y., 2008. A non-intrusive quality measure of dereverberated speech. In: Proc. Internat. Workshop Acoust. Echo Noise Control.
Falk, 2010, Modulation spectral features for robust far-field speaker identification, IEEE Trans. Audio Speech Lang. Process., 18, 90, 10.1109/TASL.2009.2023679
Falk, T., Stadler, S., Kleijn, W.B., Chan, W.-Y., 2007. Noise suppression based on extending a speech-dominated modulation band. In: Proc. ISCA Conf. Internat. Speech Commun. Assoc. (INTERSPEECH) Antwerp, Belgium, pp. 970–973.
Falk, 2010, A non-intrusive quality and intelligibility measure of reverberant and dereverberated speech, IEEE Trans. Audio Speech Lang. Process., 18, 1766, 10.1109/TASL.2010.2052247
Gray, 1980, Distortion measures for speech processing, IEEE Trans. Acoust. Speech Signal Process., ASSP-28, 367, 10.1109/TASSP.1980.1163421
Greenberg, S., Kingsbury, B., 1997. The modulation spectrogram: In pursuit of an invariant representation of speech. In: Proc. IEEE Internat. Conf. on Acoustics, Speech, and Signal Process. (ICASSP), Vol. 3. Munich, Germany, pp. 1647–1650.
Hermansky, H., Wan, E., Avendano, C., 1995. Speech enhancement based on temporal processing. In: Proc. IEEE Internat. Conf. on Acoustics, Speech, and Signal Process. (ICASSP), Vol. 1. Detroit, MI, USA, pp. 405–408.
Hu, 2007, Subjective comparison and evaluation of speech enhancement algorithms, Speech Comm., 49, 588, 10.1016/j.specom.2006.12.006
Huang, 2001
ITU-T P.835, 2007. Subjective test methodology for evaluating speech communication systems that include noise suppression algorithm: Additional provisions for non-stationary noise suppressors. ITU-T P.835 Recommendation, Amendment 1.
Kim, 2004, A cue for objective speech quality estimation in temporal envelope representations, IEEE Signal Process. Lett., 11, 849, 10.1109/LSP.2004.835466
Kim, 2005, ANIQUE: An auditory model for single-ended speech quality estimation, IEEE Trans. Speech Audio Process., 13, 821, 10.1109/TSA.2005.851924
Kingsbury, 1998, Robust speech recognition using the modulation spectrogram, Speech Comm., 25, 117, 10.1016/S0167-6393(98)00032-6
Lim, 1979, Enhancement and bandwidth compression of noisy speech, Proc. IEEE, 67, 1586, 10.1109/PROC.1979.11540
Loizou, 2005, Speech enhancement based on perceptually motivated Bayesian estimators of the magnitude spectrum, IEEE Trans. Speech Audio Process., 13, 857, 10.1109/TSA.2005.851929
Loizou, 2007
Lyons, J., Paliwal, K., 2008. Effect of compressing the dynamic range of the power spectrum in modulation filtering based speech enhancement. In: Proc. ISCA Conf. Internat. Speech Commun. Assoc. (INTERSPEECH), Brisbane, Australia, pp. 387–390.
Martin, R., 1994. Spectral subtraction based on minimum statistics. In: Proc. EURASIP European Signal Process. Conf. (EUSIPCO), Edinburgh, Scotland, pp. 1182–1185.
Martin, 2001, Noise power spectral density estimation based on optimal smoothing and minimum statistics, IEEE Trans. Speech Audio Process., 9, 504, 10.1109/89.928915
McAulay, 1980, Speech enhancement using a soft-decision noise suppression filter, IEEE Trans. Acoust. Speech Signal Process., ASSP-28, 137, 10.1109/TASSP.1980.1163394
Paliwal, 2008, Effect of analysis window duration on speech intelligibility, IEEE Signal Process. Lett., 15, 785, 10.1109/LSP.2008.2005755
Paliwal, 2010, Single-channel speech enhancement using spectral subtraction in the short-time modulation domain, Speech Comm., 52, 450, 10.1016/j.specom.2010.02.004
Paliwal, 2011, Role of modulation magnitude and phase spectrum towards speech intelligibility, Speech Comm., 53, 327, 10.1016/j.specom.2010.10.004
Picone, 1993, Signal modeling techniques in speech recognition, Proc. IEEE, 81, 1215, 10.1109/5.237532
Quackenbush, 1988
Quatieri, 2002
Rabiner, 2010
Rix, A., Beerends, J., Hollier, M., Hekstra, A., 2001. Perceptual Evaluation of Speech Quality (PESQ), an objective method for end-to-end speech quality assessment of narrowband telephone networks and speech codecs. ITU-T Recommendation P.862.
Scalart, P., Filho, J., 1996. Speech enhancement based on a priori signal to noise estimation. In: Proc. IEEE Internat. Conf. on Acoustics, Speech, and Signal Process. (ICASSP), Vol. 2. Atlanta, Georgia, USA, pp. 629–632.
Shannon, B., Paliwal, K., 2006. Role of phase estimation in speech enhancement. In: Proc. Internat. Conf. on Spoken Language Process. (ICSLP), Pittsburgh, PA, USA, pp. 1423–1426.
Sim, 1998, A parametric formulation of the generalized spectral subtraction method, IEEE Trans. Speech Audio Process., 6, 328, 10.1109/89.701361
So, 2011, Modulation-domain Kalman filtering for single-channel speech enhancement, Speech Comm., 53, 818, 10.1016/j.specom.2011.02.001
Sohn, 1999, A statistical model-based voice activity detection, IEEE Signal Process. Lett., 6, 1, 10.1109/97.736233
Thompson, J., Atlas, L., 2003. A non-uniform modulation transform for audio coding with increased time resolution. In: Proc. IEEE Internat. Conf. on Acoustics, Speech, and Signal Process. (ICASSP), Vol. 5. Hong Kong, pp. 397–400.
Tyagi, V., McCowan, I., Bourlard, H., Misra, H., 2003. On factorizing spectral dynamics for robust speech recognition. In: Proc. ISCA European Conf. on Speech Commun. and Technology (EUROSPEECH), Geneva, Switzerland, pp. 981–984.
Vary, 2006
Virag, 1999, Single channel speech enhancement based on masking properties of the human auditory system, IEEE Trans. Speech Audio Process., 7, 126, 10.1109/89.748118
Wang, 1982, The unimportance of phase in speech enhancement, IEEE Trans. Acoust. Speech Signal Process., ASSP-30, 679, 10.1109/TASSP.1982.1163920
Wiener, 1949
Wu, S., Falk, T., Chan, W.-Y., 2009. Automatic recognition of speech emotion using long-term spectro-temporal features. In: Internat. Conf. on Digital Signal Process.
Zadeh, 1950, Frequency analysis of variable networks, Proc. IRE, 38, 291, 10.1109/JRPROC.1950.231083