Low rank sparse decomposition model based speech enhancement using gammatone filterbank and Kullback–Leibler divergence

International Journal of Speech Technology - Volume 21, Issue 2 - Pages 217-231 - 2018

Nasir Saleem¹, Gohar Ijaz¹

¹Department of Electrical Engineering, Faculty of Engineering and Technology, Gomal University, Dera Ismail Khan, Pakistan

References

Benesty, J., Chen, J., Huang, Y. A., & Doclo, S. (2005). Study of the Wiener filter for noise reduction. In Speech enhancement (pp. 9–41). Berlin: Springer.

Boldt, J., Kjems, U., Pedersen, M. S., Lunner, T., & Wang, D. (2008). Estimation of the ideal binary mask using directional systems. In Proceedings of the International Workshop on Acoustic Echo and Noise Control.

Boll, S. (1979). Suppression of acoustic noise in speech using spectral subtraction. IEEE Transactions on Acoustics, Speech, and Signal Processing, 27(2), 113–120.

Boyd, S., Parikh, N., Chu, E., Peleato, B., & Eckstein, J. (2011). Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends® in Machine Learning, 3(1), 1–122.

Candès, E. J., Li, X., Ma, Y., & Wright, J. (2011). Robust principal component analysis? Journal of the ACM (JACM), 58(3), 11.

De Moor, B. (1993). The singular value decomposition and long and short spaces of noisy matrices. IEEE Transactions on Signal Processing, 41(9), 2826–2838.

Ephraim, Y., & Malah, D. (1984). Speech enhancement using a minimum-mean square error short-time spectral amplitude estimator. IEEE Transactions on Acoustics, Speech, and Signal Processing, 32(6), 1109–1121.

Ephraim, Y., & Malah, D. (1985). Speech enhancement using a minimum mean-square error log-spectral amplitude estimator. IEEE Transactions on Acoustics, Speech, and Signal Processing, 33(2), 443–445.

Ephraim, Y., & Van Trees, H. L. (1995). A signal subspace approach for speech enhancement. IEEE Transactions on Speech and Audio Processing, 3(4), 251–266.

Hermus, K., & Wambacq, P. (2006). A review of signal subspace speech enhancement and its application to noise robust speech recognition. EURASIP Journal on Advances in Signal Processing, 2007(1), 045821.

Hirsch, H. G., & Pearce, D. (2000). The Aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions. Automatic speech recognition: Challenges for the new Millenium ISCA Tutorial and Research Workshop.

Hu, G., & Wang, D. (2004). Monaural speech segregation based on pitch tracking and amplitude modulation. IEEE Transactions on Neural Networks, 15(5), 1135–1150.

Hu, Y., & Loizou, P. C. (2003). A generalized subspace approach for enhancing speech corrupted by colored noise. IEEE Transactions on Speech and Audio Processing, 11(4), 334–341.

Hu, Y., & Loizou, P. C. (2008). Evaluation of objective quality measures for speech enhancement. IEEE Transactions on Audio, Speech, and Language Processing, 16(1), 229–238.

Huang, J., Zhang, X., Zhang, Y., Zou, X., & Zeng, L. (2014). Speech denoising via low-rank and sparse matrix decomposition. ETRI Journal, 36(1), 167–170.

Huang, P. S., Chen, S. D., Smaragdis, P., & Hasegawa-Johnson, M. (2012). Singing-voice separation from monaural recordings using robust principal component analysis. IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 57–60.

Jolliffe, I. T. (2002). Principal component analysis and factor analysis. In Principal component analysis (pp. 150–166). New York: Springer.

Li, Y., & Wang, D. (2009). On the optimality of ideal binary time–frequency masks. Speech Communication, 51(3), 230–239.

Liang, S., Liu, W., & Jiang, W. (2012). Integrating binary mask estimation with MRF priors of cochleagram for speech separation. IEEE Signal Processing Letters, 19(10), 627–630.

Liutkus, A., & Badeau, R. (2015). Generalized Wiener filtering with fractional power spectrograms. In Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on (pp. 266–270). IEEE.

Loizou, P. C. (2007). Subjective evaluation and comparison of speech enhancement algorithms. Speech Communication, 49, 588–601.

Loizou, P. C. (2013). Speech enhancement: theory and practice. New York: CRC Press.

Manohar, K., & Rao, P. (2006). Speech enhancement in nonstationary noise environments using noise properties. Speech Communication, 48(1), 96–109.

Martin, R. (2001). Noise power spectral density estimation based on optimal smoothing and minimum statistics. IEEE Transactions on Speech and Audio Processing, 9(5), 504–512.

Mavaddaty, S., Ahadi, S. M., & Seyedin, S. (2016). A novel speech enhancement method by learnable sparse and low-rank decomposition and domain adaptation. Speech Communication, 76, 42–60.

Messaoud, M. A. B., & Bouzid, A. (2017). Sparse representations for single channel speech enhancement based on voiced/unvoiced classification. Circuits, Systems, and Signal Processing, 36(5), 1912–1933.

Min, G., Zhang, X., Zou, X., & Sun, M. (2016). Mask estimate through Itakura-Saito nonnegative RPCA for speech enhancement. IEEE International Workshop on Acoustic Signal Enhancement, pp. 1–5.

Rangachari, S., & Loizou, P. C. (2006). A noise-estimation algorithm for highly non-stationary environments. Speech Communication, 48(2), 220–231.

Rix, A. W., Beerends, J. G., Hollier, M. P., & Hekstra, A. P. (2001). Perceptual evaluation of speech quality (PESQ)-a new method for speech quality assessment of telephone networks and codecs. IEEE International Conference on Acoustics, Speech, and Signal Processing, Vol. 2, pp. 749–752.

Saleem, N. (2017). Single channel noise reduction system in low SNR. International Journal of Speech Technology, 20(1), 89–98.

Saleem, N., & Irfan, M. (2017). Noise reduction based on soft masks by incorporating SNR uncertainty in frequency domain. Circuits, Systems, and Signal Processing. https://doi.org/10.1007/s00034-017-0684-5.

Saleem, N., Mustafa, E., Nawaz, A., & Khan, A. (2015a). Ideal binary masking for reducing convolutive noise. International Journal of Speech Technology, 18(4), 547–554.

Saleem, N., Shafi, M., Mustafa, E., & Nawaz, A. (2015b). A novel binary mask estimation based on spectral subtraction gain-induced distortions for improved speech intelligibility and quality. University of Engineering and Technology Taxila. Technical Journal, 20(4), 36.

Scalart, P. (1996). Speech enhancement based on a priori signal to noise estimation. IEEE International Conference on Acoustics, Speech, and Signal Processing.

Soon, I. Y., & Koh, S. N. (2000). Low distortion speech enhancement. IEEE Proceedings-Vision, Image and Signal Processing, 147(3), 247–253.

Sorensen, K. V., & Andersen, S. V. (2005). Speech enhancement with natural sounding residual noise based on connected time-frequency speech presence regions. EURASIP Journal on Advances in Signal Processing, 2005(18), 305909.

Sun, D. L., & Fevotte, C. (2014). Alternating direction method of multipliers for non-negative matrix factorization with the beta-divergence. IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 6201–6205.

Taal, C. H., Hendriks, R. C., Heusdens, R., & Jensen, J. (2011). An algorithm for intelligibility prediction of time–frequency weighted noisy speech. IEEE Transactions on Audio, Speech, and Language Processing, 19(7), 2125–2136.

Wang, D., & Brown, G. J. (2006). Computational auditory scene analysis: Principles, algorithms, and applications. Hoboken, NJ: Wiley-IEEE Press.

Wang, D., Kjems, U., Pedersen, M. S., Boldt, J. B., & Lunner, T. (2008). Speech perception of noise with binary gains. The Journal of the Acoustical Society of America, 124(4), 2303–2307.

Wang, H. Y., Zhao, X. H., & Gu, H. J. (2011). Speech enhancement using super gauss mixture model of speech spectral amplitude. The Journal of China Universities of Posts and Telecommunications, 18, 13–18.

Wiem, B., & Aicha, B. (2016). Single channel speech separation based on PCA and Fuzzy logic. Neural Parallel & Scientific Computations, 24, 489–504.

Wright, J., Ganesh, A., Rao, S., Peng, Y., & Ma, Y. (2009). Robust principal component analysis: Exact recovery of corrupted low-rank matrices via convex optimization. Advances in neural information processing systems (pp. 2080–2088).

Zhou, T., & Tao, D. (2011). Godec: Randomized low-rank & sparse matrix decomposition in noisy case. In International conference on machine learning.
