Parametric representation of excitation source information for language identification

Computer Speech & Language - Volume 41 - Pages 88-115 - 2017
Dipanjan Nandi¹, Debadatta Pati², K. Sreenivasa Rao¹
¹ School of Information Technology, Indian Institute of Technology, Kharagpur, 721302, West Bengal, India
² Department of Electronics and Communication Engineering, National Institute of Technology, Nagaland, 797103, India

References

Ananthapadmanabha, 1982, Calculation of true glottal flow and its components, Speech Commun, 1, 167, 10.1016/0167-6393(82)90015-2
Atal, 1972, Automatic speaker recognition based on pitch contours, J. Acoust. Soc. Am, 52, 1687, 10.1121/1.1913303
Audacity Team
Bajpai, 2004
Balleda, 2000
Dehak, 2011, Front-end factor analysis for speaker verification, IEEE Trans. Audio Speech Lang. Proc, 19, 788, 10.1109/TASL.2010.2064307
Dehak, 2011
Dempster, 1977, Maximum likelihood from incomplete data via the EM algorithm, J. Roy. Stat. Soc. B, 39, 1
Dominguez, 2014
Gray, 1974, A spectral-flatness measure for studying the autocorrelation method of linear prediction of speech analysis, IEEE Trans. Acoustic Speech Signal Proc, 22, 207, 10.1109/TASSP.1974.1162572
Gupta, 2002
Hayakawa, 1997, Speaker identification using harmonic structure of LP-residual spectrum, vol. 1206, 253
Informer Technologies Inc.
Jothilakshmi, 2012, A hierarchical language identification system for Indian languages, Digital Signal Proc, 22, 544, 10.1016/j.dsp.2011.11.008
Li, 2014, Simplified supervised i-vector modeling with application to robust and efficient language identification and speaker verification, Comput. Speech Lang, 28, 940, 10.1016/j.csl.2014.02.004
Maity, 2012
Makhoul, 1975, Linear prediction: a tutorial review, IEEE Proc, 63, 561, 10.1109/PROC.1975.9792
Martinez, 2011
Mary, 2006
Mary, 2004
2012
Moreno, 2014
Morgan, 1992
Murthy, 2008, Epoch extraction from speech signal, IEEE Trans. Audio Speech Lang. Proc, 16, 1602, 10.1109/TASL.2008.2004526
Murthy, 2009, Characterization of glottal activity from speech signal, IEEE Signal Proc. Lett, 16, 469, 10.1109/LSP.2009.2016829
Murty, 2008, Epoch extraction from speech signals, IEEE Trans. Audio Speech Lang. Proc, 16, 1602, 10.1109/TASL.2008.2004526
Muthusamy, 1992
Naylor, 2007, Estimation of glottal closure instants in voiced speech using the DYPSA algorithm, IEEE Trans. Audio Speech Lang. Proc, 15, 34, 10.1109/TASL.2006.876878
Pati, 2011, Subsegmental, segmental and suprasegmental processing of linear prediction residual for speaker information, Int. J. Speech Technol, 14, 49, 10.1007/s10772-010-9087-8
Pati, 2013, A comparative study of explicit and implicit modelling of subsegmental speaker-specific excitation source information, Sadhana, 38, 591, 10.1007/s12046-013-0163-z
Plumpe, 1999, Modeling of the glottal flow derivative waveform with application to speaker identification, IEEE Trans. Speech Audio Proc, 7, 569, 10.1109/89.784109
Qi, 1994, A simplified approximation of the four-parameter LF model of voice source, J. Acoust. Soc. Am, 96, 1182, 10.1121/1.410392
Rabiner, 1993
Rao, 2013, Characterization and recognition of emotions from speech using excitation source information, Int. J. Speech Technol, 16, 181, 10.1007/s10772-012-9175-z
Rao, 2013, Pitch synchronous and glottal closure based speech analysis for language recognition, Int. J. Speech Technol, 16, 413, 10.1007/s10772-013-9193-5
Reddy, 2013, Identification of Indian languages using multi-level spectral and prosodic features, Int. J. Speech Technol, 16, 489, 10.1007/s10772-013-9198-0
Reynolds, 1995, Robust text-independent speaker identification using Gaussian mixture speaker models, IEEE Trans. Speech Audio Proc, 3, 72, 10.1109/89.365379
Singh, 2013
Sugiyama, 1991
Travadi, 2014
Vanishree, 2011
Website cell, IT Unit, NSD
Wolf, 1972, Efficient acoustic parameters for speaker recognition, J. Acoust. Soc. Am, 51, 2044, 10.1121/1.1913065
Yegnanarayana, 1997
Yegnanarayana, 2005, Combining evidence from source, suprasegmental and spectral features for a fixed-text speaker verification system, IEEE Trans. Speech Audio Proc, 13, 575, 10.1109/TSA.2005.848892
Yegnanarayana, 2009, Event based instantaneous fundamental frequency estimation from speech signals, IEEE Trans. Audio Speech Lang. Proc, 17, 614, 10.1109/TASL.2008.2012194
Zissman, 1993