Multiclass support vector machines for environmental sounds classification in visual domain based on log-Gabor filters

International Journal of Speech Technology - Volume 16 - Pages 203-213 - 2012
Souli Sameh1,2, Zied Lachiri2
1Signal, Image and Pattern Recognition Research Unit, Dept. of Electrical Engineering, ENIT, Le Belvédère, Tunisia
2Dept. of Physics and Instrumentation, INSAT, Centre Urbain, Tunisia

Abstract

This paper presents an approach aimed at recognizing environmental sounds for surveillance and security applications. We propose a robust environmental sound classification approach based on spectrogram features derived from log-Gabor filters. This approach includes three methods. In the first two methods, the spectrograms are passed through appropriate log-Gabor filter banks, and the outputs are averaged and subjected to an optimal feature selection procedure based on a mutual information criterion. The third method follows the same steps but applies them only to three patches extracted from each spectrogram. To investigate the accuracy of the proposed methods, we conduct experiments using a large database containing 10 environmental sound classes. The classification results based on multiclass support vector machines show that the second method is the most efficient, with an average classification accuracy of 89.62%.
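The pipeline summarized above (log-Gabor filtering of the spectrogram, averaging of the filter outputs, mutual-information feature selection, multiclass SVM) can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the filter-bank parameters (4 scales, 6 orientations, sigma/f0 = 0.55), the number of retained features, the RBF-kernel SVM settings, and the random spectrograms and labels are all hypothetical stand-ins for demonstration.

import numpy as np
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def log_gabor_bank(shape, n_scales=4, n_orient=6, min_wavelength=3, mult=2.1, sigma_f=0.55):
    """Build a bank of 2-D log-Gabor filters defined in the frequency domain."""
    rows, cols = shape
    u = np.fft.fftshift(np.fft.fftfreq(cols))
    v = np.fft.fftshift(np.fft.fftfreq(rows))
    U, V = np.meshgrid(u, v)
    radius = np.sqrt(U**2 + V**2)
    radius[rows // 2, cols // 2] = 1.0          # avoid log(0) at the DC bin
    theta = np.arctan2(-V, U)

    filters = []
    for s in range(n_scales):
        f0 = 1.0 / (min_wavelength * mult**s)   # centre frequency of this scale
        radial = np.exp(-(np.log(radius / f0))**2 / (2 * np.log(sigma_f)**2))
        radial[rows // 2, cols // 2] = 0.0      # zero response at DC
        for o in range(n_orient):
            angle = o * np.pi / n_orient
            # angular Gaussian around the filter orientation (wrap-safe difference)
            d_theta = np.arctan2(np.sin(theta - angle), np.cos(theta - angle))
            angular = np.exp(-d_theta**2 / (2 * (np.pi / n_orient / 1.5)**2))
            filters.append(radial * angular)
    return filters

def log_gabor_features(spectrogram, filters):
    """Filter the spectrogram with each log-Gabor filter and average the magnitude response."""
    F = np.fft.fftshift(np.fft.fft2(spectrogram))
    return np.array([np.mean(np.abs(np.fft.ifft2(np.fft.ifftshift(F * g)))) for g in filters])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in data: 40 random 64x64 "spectrograms" spread over 4 classes.
    spectrograms = rng.random((40, 64, 64))
    labels = np.repeat(np.arange(4), 10)

    bank = log_gabor_bank((64, 64))
    X = np.array([log_gabor_features(s, bank) for s in spectrograms])

    # Mutual-information feature selection followed by a multiclass (one-vs-one) RBF SVM.
    clf = make_pipeline(StandardScaler(),
                        SelectKBest(mutual_info_classif, k=12),
                        SVC(kernel="rbf", decision_function_shape="ovo"))
    clf.fit(X, labels)
    print("training accuracy:", clf.score(X, labels))

In this sketch each spectrogram is reduced to one averaged value per filter (here 4 x 6 = 24 features), half of which are kept by the mutual-information criterion before the one-vs-one SVM is trained.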
