A template-based algorithm by geometric means for the automatic and efficient recognition of music chords
Evolutionary Intelligence - Pages 1-15 - 2022
Abstract
In this work, we introduce a template-based computational method for recognizing chords from an audio recording of a musical instrument. The algorithm is based on a time-frequency analysis using Gabor filter banks. The filters are centered on the frequencies of musical notes in different octaves, adjusted to account for the detuning present in the recording. From the filter outputs, a geometric mean is calculated for each chord. These statistics are computed from the combinations of notes that form each chord, which are automatically grouped as templates, and the presence of a chord is determined from these metrics. Several experiments are carried out for major, minor, augmented, diminished and suspended chords played on acoustic guitar, classical guitar, electric guitar, piano and ukulele. A comparative study against machine-learning classifiers is presented, and the results show superior performance for the present approach. In addition, unlike methods based on machine-learning algorithms, the proposed method does not require a training stage, which significantly reduces the storage and time required for processing.
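The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the kernel width (`cycles`), the summing of filter energies across octaves, the single global `detune_cents` offset, the averaging over the whole clip, and every function name are assumptions made for illustration. Only the overall idea (Gabor filters centered on note frequencies, then a geometric mean over the notes of each chord template) comes from the text.

```python
import numpy as np

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
# Semitone offsets from the root for each chord quality (standard music theory).
CHORD_INTERVALS = {
    "maj": (0, 4, 7),
    "min": (0, 3, 7),
    "aug": (0, 4, 8),
    "dim": (0, 3, 6),
    "sus4": (0, 5, 7),
}

def note_frequency(midi, detune_cents=0.0):
    """Equal-tempered frequency (A4 = 440 Hz), shifted by a global detuning in cents."""
    return 440.0 * 2.0 ** ((midi - 69) / 12.0 + detune_cents / 1200.0)

def gabor_energy(signal, sr, freq, cycles=10.0):
    """Mean magnitude of the signal after a complex Gabor filter centered at freq."""
    sigma = cycles / freq                      # Gaussian width in seconds (assumed)
    half = int(3 * sigma * sr)                 # truncate the kernel at +/- 3 sigma
    t = np.arange(-half, half + 1) / sr
    window = np.exp(-0.5 * (t / sigma) ** 2)
    kernel = window / window.sum() * np.exp(2j * np.pi * freq * t)
    return np.abs(np.convolve(signal, kernel, mode="same")).mean()

def pitch_class_energies(signal, sr, octaves=(3, 4, 5), detune_cents=0.0):
    """Sum the Gabor energies of each pitch class over the chosen octaves."""
    energies = np.zeros(12)
    for octave in octaves:
        for pc in range(12):
            midi = 12 * (octave + 1) + pc      # MIDI convention: C4 = 60
            freq = note_frequency(midi, detune_cents)
            energies[pc] += gabor_energy(signal, sr, freq)
    return energies

def recognize_chord(signal, sr, detune_cents=0.0):
    """Score every chord template by the geometric mean of its note energies."""
    energies = pitch_class_energies(signal, sr, detune_cents=detune_cents)
    scores = {}
    for root in range(12):
        for quality, intervals in CHORD_INTERVALS.items():
            notes = [(root + i) % 12 for i in intervals]
            # Geometric mean via logs; the small constant avoids log(0).
            score = np.exp(np.mean(np.log(energies[notes] + 1e-12)))
            scores[f"{NOTE_NAMES[root]}:{quality}"] = score
    return max(scores, key=scores.get), scores

def synth_chord(midi_notes, sr=4000, seconds=1.0):
    """Synthetic test signal: a sum of pure sine tones, one per note."""
    t = np.arange(int(sr * seconds)) / sr
    return sum(np.sin(2 * np.pi * note_frequency(m) * t) for m in midi_notes)

if __name__ == "__main__":
    label, _ = recognize_chord(synth_chord([60, 64, 67]), 4000)  # C4, E4, G4
    print(label)
```

The geometric mean is a natural fit for template matching because a single weak note pulls the whole chord score toward zero, so a template only scores highly when all of its notes are present. Note that templates sharing the same pitch classes (for example, the three augmented chords built on C, E and G#) receive identical scores in this sketch; the abstract does not say how the original method resolves such ties.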