Fusion of multi-stream speech features for dialect classification
Tóm tắt
Từ khóa
Tài liệu tham khảo
Zissman MA, Gleason TP, Rekart DM, Losiewicz BL (1996) Automatic dialect identification of extemporaneous conversational, Latin American Spanish speech. In: Proceedings on IEEE Conference acoustics, speech, signal processing, vol. 2, pp 777–780
Kumpf K, King RW (1996) Automatic accent classification of foreign accented Australian English speech. Proc Fourth IEEE Int Conf Spok Lang 3:1740–1743
Torres-Carrasquillo PA, Sturim D, Reynolds D, McCree A (2008) Eigen-channel compensation and discriminatively trained gaussian mixture models for dialect and accent recognition. In Interspeech, Brisbane
Huang R, Hansen JHL, Angkititrakul P (2007) Dialect/accent classification using unrestricted audio. IEEE Trans Audio Speech Lang Process 15(2):453–464
Wells JC (1982) Accent of English, vol 2. Cambridge University Press, London
Arslan LM, Hansen JHL (1996) Language accent classification in american English. Speech Commun 18:353–367
Rouas J-L, Farinas J, Pellegrino F, Andre-Obrecht R (2003) Modeling prosody for language identification on read and spontaneous speech. In: Proceedings on IEEE international conference acoustical, speech, signal process, Hong Kong, vol 1, pp 40–43
Yan Q, Vaseghi S, Rentzos D, Ho C-H, Turajlic E (2003) Analysis of acoustic correlates of British, Australian and American accents. In IEEE workshop on automatic speech recognition and understanding, pp 345–350
Ma B, Zhu D, Tong R (2006) Chinese dialect identification using tone features based on pitch flux. In: Proceedings of ICASP’06, pp 901–904
Alorfi FS (2008) Automatic identification of Arabic dialects using hidden markov models. PhD Dissertation, University of Pittsburgh
Peters J, Gilles P, Auer P, Selting M (2002) Identification of regional varieties by intonational cues. An experimental study on Hamburg and Berlin German. Lang Speech 45(2):115–139
Barkat M, Ohala J, Pellegrino F (1999). Prosody as a distinctive feature for the discrimination of Arabic dialects. In: Proceedings of Eurospeech’99, p 1
Hamdi R, Barkat-Defradas M, Ferragne E, Pellegrino F (2004) Speech timing and rhythmic structure in Arabic dialects: a comparison of two approaches. In: Proceedings of interspeech’04
Blackburn CS, Vonwiller JP, King RW (1993) Automatic accent classification using artificial neural networks. In: Proceedings of Eurospeech ‘93, Vol 2, pp 1241–1244
Rao KS, Koolagudi SG (2011) Identification of Hindi dialects and emotions using spectral and prosodic features of speech. IJSCI 9(4):24–33
Biadsy F, Hirschberg J, Ellis DPW (2011) Dialect and accent recognition using phonetic-segmentation supervectors. In: Interspeech, Florence
Hou J, Liu Y, Zheng TF, Olsen J, Tian J (2010) Multi-layered features with SVM for Chinese accent identification. In: IEEE international conference on audio language and image processing (ICALIP), pp 25–30
Lazaridis A, Khoury E, Goldman JP, Avanzi M, Marcel S, Garner PN (2014) Swiss French regional accent identification. In: Proceedings of Odyssey, Joensuu
Yegnanarayana B, Kishore SP (2002) AANN: an alternative to GMM for pattern recognition. IEEE Trans Neural Netw 15:459–469
Sinha S, Jain A, Agrawal SS (2014) Speech processing for Hindi dialect recognition. Adv Signal Process Intell Recognit Syst 264:161–169
Rubio Ayuso AJ, Lopez Soler JM (1995) Speech recognition and coding new advances and trends. Springer, New York
Ghorshi S, Vaseghi S, Yan Q (2008) Cross-entropic comparison of formants of British, Australian and American English accents. Speech Commun 50:564–579
Davis S, Mermelstein P (1980) Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences. IEEE Trans Acoust Speech Signal Process 28(4):357–366
Rong T (2006) Automatic speaker and language identification. A First Year Report Submitted to the School of Computer Engineering of the Nanyang Technological University, Nanyang
Behravan H (2012) Dialect and accent recognition. Dissertation, University of Eastern Finland
Torres-Carrasquillo PA, Singer E, Kohler MA, Greene RJ, Ryenolds DA, Deller JR, Jr. (2002) Approaches to language identification using Gaussian mixture models and shifted delta Cepstral features. In: Proceedings on ICSLP, Denver, pp 89–92
Calvo J, Fernndez R, Hernndez G (2007) Channel/handset mismatch evaluation in a biometric speaker verification using shifted delta Cepstral features. In: Proceedings of CIARP 2007, LNCS 4756, pp 96–105
Esther G, Brechtje P, Francis N, Kimberley F (2000) Pitch accent realization in four varieties of British English. J Phon 28:161–185
Ganapathiraju A et al (2001) Syllable-based large vocabulary continuous speech recognition. IEEE Trans Speech Audio Process 9(4):358–366
Deivapalan PG, Jha M, Guttikonda R, Murthy HA (2008) Donlabel: an automatic labeling tool for Indian languages. In: National conference on communications, Bombay
Ladefoged P (1996) Elements of acoustic phonetics, 2nd edn. The University of Chicago Press, Chicago
Zheng DC et al (2012) A new approach to acoustic analysis of two British regional accents—Birmingham and Liverpool accents. Int J Speech Technol 15(2):77–85
Kumpf K, King RW (1997) Foreign speaker accent classification using phoneme-dependent accent discrimination models and comparisons with human perception benchmarks. In: Proceedings on Eurospeech, pp 2323–2326
Mehrabani M, Boril H, Hansen JH (2010) Dialect distance assessment method based on comparison of pitch pattern statistical models. In: Proceedings on IEEE international conference on acoustics speech and signal processing (ICASSP), pp 785–797
Chang, E et al. (2000) Large vocabulary Mandarin speech recognition with different approaches in modeling tones. In: Proceedings on Interspeech
Chen CJ et al. (1997) New methods in continuous Mandarin speech recognition. In: Proceedings on Eurospeech, Spain
Grover C, Jamieson DG, Dobrovolsky MB (1987) Intonation in English, French and German: perception and production. Lang Speech 30:277–296
Kulshreshtha M, Mathur R (2012) Dialect accent feature for establishing speaker identity: a case study. In: Neustein A (ed) Springer briefs in electrical and computer engineering. Springer, New York
Zahorian SA, Hu H (2008) A spectral/temporal method for robust fundamental frequency tracking. J Acoust Soc Am 123(6):4559–4571
Biadsy F (2011) Automatic dialect and accent recognition and its application to speech recognition. Dissertation, Columbia University
Gonzalez DR, Calvo de Lara JR (2009) Speaker verification with shifted delta Cepstral features: its pseudo-prosodic behaviour. In: Proceedings on I Iberian SLTech
Zolney A, Kocharov D, Schluter R, Ney H (2007) Using multiple acoustic feature sets for speech recognition. Speech Commun 49:514–525
Aggarwal RK, Dave M (2013) Performance evaluation of sequentially combined heterogeneous feature streams for Hindi speech recognition system. Telecommun Syst 52(3):1457–1466
Kramer MA (1991) Non linear principal component analysis using auto associative neural networks. AIChE J 37:233–243
Sivaram GSVS, Thomas S, Hermansky H (2011) Mixture of auto-associative neural networks for speaker verification. In: Proceedings on INTERSPEECH
Bianchini M, Frasconi P, Gori M (1995) Learning in multilayered networks used as autoassociators. IEEE Trans Neural Netw 6:512–515
Campbell WM et al (2006) Support vector machines for speaker and language recognition. Comput Speech Lang 20(2):210–229