Arabic vowels recognition based on wavelet average framing linear prediction coding and neural network

Speech Communication - Volume 55 - Pages 641-652 - 2013
K. Daqrouq1, K.Y. Al Azzawi2
1Electrical and Comp. Eng. Dept., King Abdulaziz University, Jeddah, Saudi Arabia
2Electromechanical Engineering Dept., Univ. of Technology, Baghdad, Iraq

References

Abu-Rabia, 1999, The effect of Arabic vowels on the reading comprehension of second- and sixth-grade native Arab children, J. Psycholinguist. Res., 28, 93, 10.1023/A:1023291620997
Alghamdi, 1998, A spectrographic analysis of Arabic vowels: a cross-dialect study, J. King Saud Univ., 10, 3
Alotaibi, Y., Hussain, A., 2009. Formant based analysis of spoken Arabic vowels. In: Proc. BioID_MultiComm, Madrid, Spain.
Alotaibi, Y., Hussain, A., 2009. Speech recognition system and formant based analysis of spoken Arabic vowels. In: Proc. First International Conference, December, FGIT, Jeju Island, Korea, pp. 10–12.
Alotaibi, 2005, Investigating spoken Arabic digits in speech recognition setting, Inform. Sci., 173, 115, 10.1016/j.ins.2004.07.008
Amrouche, A., et al., 2009. An efficient speech recognition system in adverse conditions using the nonparametric regression. In: Engineering Applications of Artificial Intelligence.
Amrouche, A., Rouvaen, J.M., 2003. Arabic isolated word recognition using general regression neural network. In: Proc. 46th IEEE MWSCAS, 689–692.
Anani, M., 1999. Arabic vowel formant frequencies. In: Proc. 14th International Congress of Phonetic Sciences, Vol. 9. San Francisco, CA, pp. 2117–2119.
Andrianopoulos, 2001, Multimodal standardization of voice among four multicultural populations: fundamental frequency and spectral characteristics, J. Voice, 15, 194, 10.1016/S0892-1997(01)00021-2
Atal, 2006, The history of linear prediction, Signal Process. Mag. IEEE, 23, 154, 10.1109/MSP.2006.1598091
Avci, 2009, An expert system for speaker identification using adaptive wavelet sure entropy, Expert Syst. Appl., 36, 6295, 10.1016/j.eswa.2008.07.012
Avci, 2007, A new optimum feature extraction and classification method for speaker recognition: GWPNN, Expert Syst. Appl., 32, 485, 10.1016/j.eswa.2005.12.004
Avci, 2006, An expert discrete wavelet adaptive network based fuzzy inference system for digital modulation recognition, Expert Syst. Appl., 33, 582, 10.1016/j.eswa.2006.06.001
Cherif, 2001, Pitch detection and formant analysis of Arabic speech processing, Appl. Acoust., 62, 1129, 10.1016/S0003-682X(01)00007-X
Daqrouq, 2011, Wavelet entropy and neural network for text-independent speaker identification, Eng. Appl. Artif. Intel., 24, 796, 10.1016/j.engappai.2011.01.001
Daqrouq, 2010, An investigation of speech enhancement using wavelet filtering method, Int. J. Speech Technol., 13, 101, 10.1007/s10772-010-9073-1
Daqrouq, 2009, Spoken Arabic Digit Classifier via Sophisticated Wavelet Transform Features Extraction Method, Int. J. Inform. Sci. Comput. Eng. (IJISCE), 1
Daubechies, 1988, Orthonormal bases of compactly supported wavelets, Comm. Pure Appl. Math., 41, 909, 10.1002/cpa.3160410705
Delac, 2009, Face recognition in JPEG and JPEG2000 compressed domain, Image Vis. Comp., 27, 1108, 10.1016/j.imavis.2008.10.007
Engin, 2007, A new optimum feature extraction and classification method for speaker recognition: GWPNN, Expert Syst. Appl., 32, 485, 10.1016/j.eswa.2005.12.004
Gowdy, J., Tufekci, Z., 2000. Mel-scaled discrete wavelet coefficients for speech recognition. In: Proc. ICASSP, vol. 3, pp. 1351–13554.
Hachkar, Z., Mounir, B., Farchi, A., El Abbadi, J., 2011. Comparison of MFCC and PLP parameterization in pattern recognition of Arabic alphabet speech. Canadian Journal on Artificial Intelligence, Machine Learning & Pattern Recognition, Vol. 2, No. 3.
Hermansky, 1990, Perceptual linear predictive (PLP) analysis of speech, J. Acoust. Soc. Am., 87, 1738, 10.1121/1.399423
Hermansky, H., Morgan, N., Hirsch, H.G., 1993. Recognition of speech in additive and convolutional noise based on RASTA spectral processing. In: International Conference on Acoustics, Speech, and Signal Processing, pp. 83–86.
Jongman, 2011, Acoustics and perception of emphasis in Urban Jordanian Arabic, J. Phonetics, 39, 85, 10.1016/j.wocn.2010.11.007
Junqua, 1994, A robust algorithm for word boundary detection in the presence of noise, IEEE Trans. Speech Audio Process., 2, 406, 10.1109/89.294354
Karray, 2003, Towards improving speech detection robustness for speech recognition in adverse conditions, Speech Commun., 40, 261, 10.1016/S0167-6393(02)00066-3
Kirchhoff, K., 2003. Novel approach to Arabic speech recognition. Final Report from the JHU Summer School Workshop. In: Proc. International Conference on ASSP (ICASSP'03), pp. 344–347.
Kirchhoff, K., Bilmes, J., Das, S., et al., 2002. Novel approaches to Arabic speech recognition: report from the 2002 Johns-Hopkins summer workshop. Technical report, Johns Hopkins University.
Kotnik, B., Kacic, Z., Horvat, B., 2003. The usage of wavelet packet transform in automatic noisy speech recognition systems. In: Proc. IEEE EUROCON, Slovenia, pp. 131–134.
Lazli, 2003, Connectionist probability estimation in HMM Arabic speech recognition using fuzzy logic, Lect. Notes LNCS, 2734, 379
Lee, 2008, The development of monophthongal vowels in Korean: age and sex differences, Clin. Linguist. Phon., 7, 523, 10.1080/02699200801945120
Lei, 2005, A novel wavelet packet division multiplexing based on maximum likelihood algorithm and optimum pilot symbol assisted modulation for Rayleigh fading channels, Circ. Syst. Signal Process., 24, 287, 10.1007/s00034-004-0529-x
Vishwanath, 1994, The recursive pyramid algorithm for the discrete wavelet transform, IEEE Trans. Signal Process., 42, 673, 10.1109/78.277863
Mallat, 1989, A theory for multiresolution signal decomposition, IEEE Trans. PAMI, 11, 674, 10.1109/34.192463
Mallat, 1998
Mayo, 1995, Fundamental frequency, perturbation, and vocal tract resonance characteristics of African-American and white American males, J. Natl. Black Assoc. Speech Lang. Hear., 17, 32
Mokbel, C., Jouvet, D., Monné, J., 1995. Blind equalization using adaptive filtering for improving speech recognition over telephone. In: European Conference on Speech Communication and Technology, pp. 141–1990.
Mokbel, 1997, Towards improving ASR robustness for PSN and GSM telephone applications, Speech Commun., 23, 141, 10.1016/S0167-6393(97)00042-3
Natour, 2010, Formant frequency characteristics in normal Arabic-speaking Jordanians, J. Voice, 25, e75, 10.1016/j.jvoice.2010.10.018
Saeed, K., Nammous, M.K., 2005. A new step in Arabic speech identification: spoken digit recognition. In: Saeed, K., Pejas, J. (Eds.), Information Processing and Security Systems, Springer Science+Business Media, New York, pp. 55–66.
Saeed, K., Nammous, M., 2005. Heuristic method of Arabic speech recognition. In: Proc. IEEE International Conference on Digital Signal Processing and its Applications (IEEE DSPA'05), pp. 528–530.
Selouani, S.A., Douglas, O., 2001. Hybrid architectures for complex phonetic features classification: a unified approach. In: International Symposium on Signal Processing and its Applications (ASSPA), Kuala Lumpur, Malaysia, pp. 719–722.
Titze, 1995
Tryon, 2001, Evaluating statistical difference, equivalence, and indeterminacy using inferential confidence intervals: an integrated alternative method of conducting null hypothesis statistical tests, Psychol. Methods, 6, 371, 10.1037/1082-989X.6.4.371
Tufekci, Z., Gowdy, J., 2000. Feature extraction using discrete wavelet transform for speech recognition. In: Proc. SOUTHEASTCON, pp. 116–123.
Uchida, S., Ronee, M.A., Sakoe, H., 2002. Using eigen-deformations in handwritten character recognition. In: Proc. 16th ICPR, Vol. 1, pp. 572–575.
Wasserman, 1992, A critical appraisal of 98.6 degrees F, the upper limit of the normal body temperature, and other legacies of Carl Reinhold August Wunderlich, J. Am. Med. Assoc., 268, 1578, 10.1001/jama.1992.03490120092034
Wu, 2009, Speaker identification using discrete wavelet packet transform technique with irregular decomposition, Expert Syst. Appl., 36, 3136
Wu, J.-D., Lin, B.-F., 2009. Speaker identification based on the frame linear predictive coding.
Xue, 2006, Volumetric measurements of vocal tracts for male speakers from different races, Clin. Linguist. Phon., 20, 691, 10.1080/02699200500297716
Zitouni, 2009, Arabic diacritic restoration approach based on maximum entropy models, Comput. Speech Lang., 23, 257, 10.1016/j.csl.2008.06.001