Environmental sound classification using a regularized deep convolutional neural network with data augmentation

Applied Acoustics - Volume 167 - Page 107389 - 2020
Zohaib Mushtaq¹, Shun-Feng Su¹
¹Department of Electrical Engineering, National Taiwan University of Science & Technology (NTUST), Taipei, Taiwan

References

Crocco, 2016, Audio surveillance, ACM Comput Surv, 48, 1, 10.1145/2871183

Choi K, Fazekas G, Sandler M, Cho K. Transfer learning for music classification and regression tasks. In Proceedings of the 18th ISMIR conference, Suzhou, China, Oct 23–27, 2017.

Bian, 2019, Audio-based music classification with DenseNet and data augmentation, Lect Notes Comput Sci (including Subser Lect Notes Artif Intell Lect Notes Bioinf), LNAI, 11672, 56

Li, 2007, Robot navigation and sound based position identification, 2449

Vacher, 2014, Sound detection and classification for medical telesurvey

Jing, 2017, DCAR: a discriminative and compact audio representation for audio processing, IEEE Trans Multimed, 19, 2637, 10.1109/TMM.2017.2703939

Intani, 2013, Crime warning system using image and sound processing, Int Conf Control Autom Syst, 1751

Ali, 2018, Innovative method for unsupervised voice activity detection and classification of audio segments, IEEE Access, 6, 15494, 10.1109/ACCESS.2018.2805845

Ye, 2017, Audio data mining for anthropogenic disaster identification: an automatic taxonomy approach, IEEE Trans Emerg Top Comput, 6750, 1

Green, 2020, Environmental sound monitoring using machine learning on mobile devices, Appl Acoust, 159, 107041, 10.1016/j.apacoust.2019.107041

Ramírez, 2019, Machine learning for music genre: multifaceted review and experimentation with audioset, J Intell Inf Syst, 1

Saon G et al. English conversational telephone speech recognition by humans and machines. In Proceedings of the annual conference of the international speech communication association, INTERSPEECH, vol. 2017; August, 2017. p. 132–6.

Zhou H, Song Y, Shu H. Using deep convolutional neural network to classify urban sounds. In IEEE region 10 annual international conference, proceedings/TENCON, vol. 2017; Dec, 2017. p. 3089–92.

Barchiesi, 2015, Acoustic scene classification: classifying environments from the sounds they produce, IEEE Signal Process Mag, 32, 16, 10.1109/MSP.2014.2326181

Chachada, 2014, Environmental sound recognition: a survey, APSIPA Trans Signal Inf Process, 3

Mesaros, 2016, TUT database for acoustic scene classification and sound event detection, Eur Signal Process Conf, 2016-Nov., 1128

Piczak KJ. ESC: dataset for environmental sound classification. In MM 2015 - proc. 2015 ACM multimed. conf.; 2015. p. 1015–8.

Salamon J, Jacoby C, Bello JP. A dataset and taxonomy for urban sound research. In MM ’14 proceedings of the 22nd ACM international conference on multimedia; 2014, no. 3. p. 1041–4.

Bountourakis, 2015, Machine learning algorithms for environmental sound recognition: towards soundscape semantics, ACM Int Conf Proc Ser, 07-09, 1

da Silva, 2019, Evaluation of classical machine learning techniques towards urban sound recognition on embedded systems, Appl Sci, 9, 1

Tokozume, 2018, Learning from between-class examples for deep sound recognition, 1

Chong D, Zou Y, Wang W. Multi-channel convolutional neural networks with multi-level feature fusion for environmental sound classification. In Lecture notes in computer science (including subseries lecture notes in artificial intelligence and lecture notes in bioinformatics), LNCS, vol. 11296; 2019. p. 157–68.

Huzaifah M. Comparison of time-frequency representations for environmental sound classification using convolutional neural networks. In arXiv e-prints; 2017. p. 1–5.

Agrawal DM, Sailor HB, Soni MH, Patil HA. Novel TEO-based gammatone features for environmental sound classification. In 25th European signal processing conference, EUSIPCO 2017, vol. 2017-Jan; 2017. p. 1809–13.

Salamon, 2017, Deep convolutional neural networks and data augmentation for environmental sound classification, IEEE Signal Process Lett, 24, 279, 10.1109/LSP.2017.2657381

Chen, 2019, Environmental sound classification with dilated convolutions, Appl Acoust, 148, 123, 10.1016/j.apacoust.2018.12.019

Dai, 2017, Very deep convolutional neural networks for raw waveforms, 421

Khamparia, 2019, Sound classification using convolutional neural network and tensor deep stacking network, IEEE Access, 7, 7717, 10.1109/ACCESS.2018.2888882

Boddapati, 2017, Classifying environmental sounds using image recognition networks, Procedia Comput Sci, 112, 2048, 10.1016/j.procs.2017.08.250

Valero, 2012, Gammatone cepstral coefficients: biologically inspired features for non-speech audio classification, IEEE Trans Multimed, 14, 1684, 10.1109/TMM.2012.2199972

Li, 2017, A comparison of Deep Learning methods for environmental sound detection, 126

Cotton, 2011, Spectral vs. spectro-temporal features for acoustic event detection, 69

Chollet F. Image preprocessing - Keras documentation. GitHub, [Online]. Available at: https://keras.io/preprocessing/image/; 2015 [Accessed: 16-Nov-2019].

McFee B et al. librosa: audio and music signal analysis in python. In Proc. 14th python sci. conf., no. Scipy; 2015. p. 18–24.

Piczak KJ. Environmental sound classification with convolutional neural networks. In 2015 IEEE international workshop on machine learning for signal processing, Boston, USA; 2015.

Zhang Z, Xu S, Cao S, Zhang S. Deep convolutional neural network with mixup for environmental sound classification. In Chinese conference on pattern recognition and computer vision (PRCV), vol. 2; 2018. p. 356–67.

Li, 2018, An ensemble stacked convolutional neural network model for environmental event sound recognition, Appl Sci, 8, 10.3390/app8071152

Zhang, 2019, Learning attentive representations for environmental sound classification, IEEE Access, 7, 130327, 10.1109/ACCESS.2019.2939495