Togneri R, Pullella D (2011) An overview of speaker identification: Accuracy and robustness issues. IEEE Circuits Syst Mag 11(2):23–61. https://doi.org/10.1109/MCAS.2011.941079
Dhakal P, Damacharla P, Javaid AY, Devabhaktuni V (2019) A near real-time automatic speaker recognition architecture for voice-based user interface. Mach Learn Knowl Extract 1(1):504–520. https://doi.org/10.3390/make1010031
Khan AU, Bhaiya LP, Banchhor SK (2012) Hindi speaking person identification using zero crossing rate. Int J Soft Comput Eng, 2(3):101–104
Bharti R, Bansal P (2015) Real time speaker recognition system using MFCC and vector quantization technique. Int J Comput Appl 117(1). https://doi.org/10.5120/20520-2361
Le PN, Ambikairajah E, Epps J et al (2011) Investigation of spectral centroid features for cognitive load classification. Speech Commun 53(4):540–551. https://doi.org/10.1016/j.specom.2011.01.005
Ghahremani P, BabaAli B, Povey D, Riedhammer K, Trmal J, Khudanpur S (2014) A pitch extraction algorithm tuned for automatic speech recognition. In: 2014 IEEE international conference on acoustics, speech and signal processing (ICASSP) (pp 2494–2498). IEEE. https://doi.org/10.1109/ICASSP.2014.6854049
Hossan MA, Memon S, Gregory MA (2010) A novel approach for MFCC feature extraction. In: 2010 4th International conference on signal processing and communication systems, pp 1–5. IEEE. https://doi.org/10.1109/ICSPCS.2010.5709752
Wang ZZ, Yong JH (2008) Texture analysis and classification with linear regression model based on wavelet transform. IEEE Trans Image Process 17(8):1421–1430. https://doi.org/10.1109/TIP.2008.926150
Noble WS (2006) What is a support vector machine? Nat Biotechnol 24(12):1565–1567. https://doi.org/10.1038/nbt1206-1565
Cunningham P, Delany SJ (2021) k-Nearest neighbour classifiers—a tutorial. ACM Comput Surv (CSUR) 54(6):1–25. https://doi.org/10.1145/3459665
Padi S, Sadjadi SO, Manocha D, Sriram RD (2022) Multimodal emotion recognition using transfer learning from speaker recognition and bert-based models. arXiv preprint arXiv:2202.08974. https://doi.org/10.48550/arXiv.2202.08974
Shivakumar PG, Georgiou P (2020) Transfer learning from adult to children for speech recognition: evaluation, analysis and recommendations. Comput Speech Lang 63:101077. https://doi.org/10.1016/j.csl.2020.101077
Beikmohammadi A, Faez K (2018) December. Leaf classification for plant recognition with deep transfer learning. In 2018 4th Iranian Conference on Signal Processing and Intelligent Systems (ICSPIS) (pp. 21–26). IEEE. https://doi.org/10.1109/ICSPIS.2018.8700547
Shahriar S, Tariq U (2021) Classifying maqams of qur’anic recitations using deep learning. IEEE Access 9:117271–117281. https://doi.org/10.1109/ACCESS.2021.3098415
Al-Ayyoub M, Damer NA, Hmeidi I (2018) Using deep learning for automatically determining correct application of basic quranic recitation rules. Int Arab J Inf Technol 15(3A):620–625
Bradbury J (2000) Linear predictive coding. Mc G. Hill
Schuller B, Rigoll G, Lang M (2003) Hidden Markov model-based speech emotion recognition. In: 2003 IEEE international conference on acoustics, speech, and signal processing, 2003. Proceedings. (ICASSP'03). IEEE. (vol 2, pp II-1). https://doi.org/10.1109/ICASSP.2003.1202279
Ting W, Guo-Zheng Y, Bang-Hua Y et al (2008) Eeg feature extraction based on wavelet packet decomposition for brain computer interface. Measurement 41(6):618–625. https://doi.org/10.1016/j.measurement.2007.07.007
Lee H, Grosse R, Ranganath R, Ng AY (2009) Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations. In: Proceedings of the 26th annual international conference on machine learning (pp 609–616). https://doi.org/10.1145/1553374.1553453
Alagrami AM, Eljazzar MM (2020) Smartajweed automatic recognition of Arabic quranic recitation rules. arXiv preprint arXiv:2101.04200. https://doi.org/10.48550/arXiv.2101.04200
Vaidyanathan PP (1990) Multirate digital filters, filter banks, polyphase networks, and applications: a tutorial. Proc IEEE 78(1):56–93. https://doi.org/10.1109/5.52200
Marlina L, Wardoyo C, Sanjaya WM, Anggraeni D, Dewi SF, Roziqin A, Maryanti S (2018) Makhraj recognition of Hijaiyah letter for children based on mel-frequency cepstrum coefficients (MFCC) and support vector machines (SVM) method. In: 2018 International conference on information and communications technology (ICOIACT) (pp 935–940). IEEE. https://doi.org/10.1109/ICOIACT.2018.8350684
Hamid R, Naim F, Naharuddin NZA (2013) Makhraj recognition for al-quran recitation using mfcc. Int J Intell Inf Process 4(2):45–53. https://doi.org/10.4156/ijiip.vol4.issue2.5
Alkhateeb JH (2020) A machine learning approach for recognizing the holy quran reciter. Int J Adv Comput Sci Appl 11(7). https://doi.org/10.14569/ijacsa.2020.0110735
Anazi M, Shahin OR (2022) A machine learning model for the identification of the holy quran reciter utilizing k-nearest neighbor and artificial neural networks. Inf Sci Lett 11(4):1093–1102.
Nahar KM, Al-Shannaq M, Manasrah A et al (2019) A holy quran reader/reciter identification system using support vector machine. Int J Mach Learn Comput 9(4):458–464.
Shah SM, Ahsan SN (2014) Arabic speaker identification system using combination of DWT and LPC features. In: 2014 International conference on open source systems and technologies. IEEE. (pp 176–181). https://doi.org/10.1109/ICOSST.2014.7029340
Shensa MJ et al (1992) The discrete wavelet transform: wedding the a trous and mallat algorithms. IEEE Trans Signal Process 40(10):2464–2482. https://doi.org/10.1109/78.157290
Chapaneri SV (2012) Spoken digits recognition using weighted MFCC and improved features for dynamic time warping. Int J Comput Appl 40(3):6–12.
Han W, Chan CF, Choy CS, Pun KP (2006). An efficient MFCC extraction method in speech recognition. In: 2006 IEEE international symposium on circuits and systems (ISCAS), IEEE. (pp 4). https://doi.org/10.1109/ISCAS.2006.1692543
Chakraborty S, Mondal R, Singh PK et al (2021) Transfer learning with fine tuning for human action recognition from still images. Multimedia Tools Appl 80:20547–20578. https://doi.org/10.1007/s11042-021-10753-y
Deng J, Dong W, Socher R et al (2009) Imagenet: a large-scale hierarchical image database. In: 2009 IEEE conference on computer vision and pattern recognition, IEEE, pp 248–255. https://doi.org/10.1109/CVPR.2009.5206848
Zoph B, Vasudevan V, Shlens J, Le QV (2018) Learning transferable architectures for scalable image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition (pp 8697–8710). https://doi.org/10.1109/CVPR.2018.00907
Tan M, Le Q (2019) Efficientnet: rethinking model scaling for convolutional neural networks. In: International conference on machine learning, PMLR, pp 6105–6114.
Tan M, Le Q (2021) Efficientnetv2: Smaller models and faster training. In: International conference on machine learning, PMLR, pp 10096–10106
Vrbanˇciˇc G, Podgorelec V (2020) Transfer learning with adaptive fine-tuning. IEEE Access 8:196197–196211. https://doi.org/10.1109/ACCESS.2020.3034343
Zoph B, Le QV (2016) Neural architecture search with reinforcement learning. arXiv preprint arXiv:1611.01578. https://doi.org/10.48550/arXiv.1611.01578
Albawi S, Mohammed TA, Al-Zawi S (2017) Understanding of a convolutional neural network. In: 2017 International conference on engineering and technology (ICET) (pp 1–6). IEEE. https://doi.org/10.1109/ICEngTechnol.2017.8308186
Kaelbling LP, Littman ML, Moore AW (1996) Reinforcement learning: a survey. J Artif Intell Res 4:237–285. https://doi.org/10.1613/jair.301
Torralba A, Fergus R, Freeman WT (2008) 80 million tiny images: A large data set for nonparametric object and scene recognition. IEEE Trans Pattern Anal Mach Intell 30(11):1958–1970. https://doi.org/10.1109/TPAMI.2008.128
Henderson P, Ferrari V (2017) End-to-end training of object class detectors for mean average precision. In: Computer vision–ACCV 2016: 13th Asian conference on computer vision, Taipei, Taiwan, November 20–24, 2016, Revised Selected Papers, Part V 13 (pp 198–213). Springer International Publishing. https://doi.org/10.48550/arXiv.1607.03476
Baheti B, Innani S, Gajre S, Talbar S (2020) Eff-unet: A novel architecture for semantic segmentation in unstructured environment. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition workshops (pp 358–359). https://doi.org/10.1109/CVPRW50498.2020.00187
Sunil CK, Jaidhar CD, Patil N (2021) Cardamom plant disease detection approach using EfficientNetV2. IEEE Access 10:789–804. https://doi.org/10.1109/ACCESS.2021.3138920
Gupta S, Jaafar J, Ahmad WW et al (2013) Feature extraction using mfcc. Signal Image Process Int J 4(4):101–108. https://doi.org/10.5121/sipij.2013.4408
Briggs WL, Henson VE (1995) The DFT: an owner’s manual for the discrete Fourier transform. Soc Ind Appl Math
Ioffe S, Szegedy C (2015) Batch normalization: accelerating deep network training by reducing internal covariate shift. In: International conference on machine learning, pp 448–456. pmlr
Agarap AF (2018) Deep learning using rectified linear units (relu). arXiv preprint arXiv:1803.08375. https://doi.org/10.48550/arXiv.1803.08375
Dietterich T (1995) Overfitting and undercomputing in machine learning. ACM Comput Surv (CSUR) 27(3):326–327
Sharma S, Sharma S, Athaiya A (2017) Activation functions in neural networks. Towards Data Sci 6(12):310–316
Berrar D (2019) Cross-validation. Encyclopedia Bioin Comput Biol, pp 542–545. https://doi.org/10.1016/B978-0-12-809633-8.20349-X