Detection of glottal closure instant and glottal open region from speech signals using spectral flatness measure
Abstract
Keywords
References
Abberton, 1989, Laryngographic assessment of normal voice: a tutorial, Clin. Linguist. Phonet., 3, 263, 10.3109/02699208908985291
Airaksinen, 2014, Quasi closed phase glottal inverse filtering analysis with weighted linear prediction, IEEE/ACM Trans. Audio, Speech Lang. Process., 22, 596, 10.1109/TASLP.2013.2294585
Alku, 1992, Glottal wave analysis with pitch synchronous iterative adaptive inverse filtering, Speech Commun., 11, 109, 10.1016/0167-6393(92)90005-R
Alku, 2011, Glottal inverse filtering analysis of human voice production - a review of estimation and parameterization methods of the glottal excitation and their applications, Sadhana, 36, 623, 10.1007/s12046-011-0041-5
Alku, 2009, Closed phase covariance analysis based on constrained linear prediction for glottal inverse filtering, J. Acoust. Soc. Am., 125, 3289, 10.1121/1.3095801
Ananthapadmanabha, 1979, Epoch extraction from linear prediction residual for identification of closed glottis interval, IEEE Trans. Acoust. Speech Signal Process., 27, 309, 10.1109/TASSP.1979.1163267
Aneeja, 2015, Single frequency filtering approach for discriminating speech and nonspeech, IEEE/ACM Trans. Audio, Speech Lang. Process., 23, 705, 10.1109/TASLP.2015.2404035
Barney, 2007, The effect of glottal opening on the acoustic response of the vocal tract, Acta Acustica united with Acustica, 93, 1046
Bouzid, 2009, Voice source parameter measurement based on multi-scale analysis of electroglottographic signal, Speech Commun., 51, 782, 10.1016/j.specom.2008.08.004
Brookes, M. Voicebox: speech processing toolbox for MATLAB. Source: https://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html.
Chi, 2007, Subglottal coupling and its influence on vowel formants, J. Acoust. Soc. Am., 122, 1735, 10.1121/1.2756793
Childers, 1984, A critical review of electroglottography, Crit. Rev. Biomed. Eng., 12, 131
Childers, 1994, Measuring and modeling vocal source-tract interaction, IEEE Trans. Biomed. Eng., 41, 663, 10.1109/10.301733
D'Alessandro, 2011, Glottal closure instant and voice source analysis using time-scale lines of maximum amplitude, Sadhana, 36, 601, 10.1007/s12046-011-0040-6
Degottex, 2009, Glottal closure instant detection from a glottal shape estimate, 226
Degottex, 2010, Joint estimate of shape and time-synchronization of a glottal source model by phase flatness, 5058
Degottex, 2011, Function of phase-distortion for glottal model estimation, 4608
Degottex, 2011, Phase minimization for glottal model estimation, IEEE Trans. Audio Speech Lang. Process., 19, 1080, 10.1109/TASL.2010.2076806
Drugman, 2014, Glottal source processing: from analysis to applications, Comput. Speech Lang., 28, 1117, 10.1016/j.csl.2014.03.003
Drugman, 2011, Causal-anticausal decomposition of speech using complex cepstrum for glottal source estimation, Speech Commun., 53, 855, 10.1016/j.specom.2011.02.004
Drugman, 2012, A comparative study of glottal source estimation techniques, Comput. Speech Lang., 26, 20, 10.1016/j.csl.2011.03.003
Drugman, 2012, Detection of glottal closure instants from speech signals: a quantitative review, IEEE Trans. Audio Speech Lang. Process., 20, 994, 10.1109/TASL.2011.2170835
Fant, 1995, The LF-model revisited. Transformations and frequency domain analysis, Speech Transm. Lab. Q. Progr. Status Report, 36, 119
Fu, 2006, Robust glottal source estimation based on joint source-filter model optimization, IEEE Trans. Audio Speech Lang. Process., 14, 492, 10.1109/TSA.2005.857807
Guerin, 1976, A voice source taking account of coupling with the supraglottal cavities, 1, 47
Henrich, 2011, Analysing and understanding the singing voice: recent progress and open questions, Curr. Bioinform., 6, 362, 10.2174/157489311796904709
Henrich, 2004, On the use of the derivative of electroglottographic signals for characterization of nonpathological phonation, J. Acoust. Soc. Am., 115, 1321, 10.1121/1.1646401
Herbst, 2014, Glottal opening and closing events investigated by electroglottography and super-high-speed video recordings, J. Exper. Biol., 217, 955, 10.1242/jeb.093203
ITU-T, Recommendation, 2005. G.191, software tools for speech and audio coding standardization. Source: http://www.itu.int/rec/T-REC-G.191-200509-I/en.
Jain, 2012, Time-order representation based method for epoch detection from speech signals, J. Intell. Syst., 21, 79
Jain, 2013, GCI identification from voiced speech using the eigen value decomposition of Hankel matrix, 371
Jain, 2014, Event-based method for instantaneous fundamental frequency estimation from voiced speech based on eigenvalue decomposition of the Hankel matrix, IEEE/ACM Trans. Audio, Speech Lang. Process., 22, 1467, 10.1109/TASLP.2014.2335056
Kadiri, 2018
Kadiri, 2015, Analysis of excitation source features of speech for emotion recognition, 1324
Kadiri, 2017, Epoch extraction from emotional speech using single frequency filtering approach, Speech Commun., 86, 52, 10.1016/j.specom.2016.11.005
Kadiri, 2018, Analysis and detection of phonation modes in singing voice using excitation source features and single frequency filtering cepstral coefficients (SFFCC), 441
Kadiri, 2018, Breathy to tense voice discrimination using zero-time windowing cepstral coefficients (ZTWCCs), 232
Kafentzis, 2011, Glottal inverse filtering using stabilised weighted linear prediction, 5408
Khanagha, 2014, Detection of glottal closure instants based on the microcanonical multiscale formalism, IEEE/ACM Trans. Audio, Speech Lang. Process., 22, 1941, 10.1109/TASLP.2014.2352451
Kominek, 2004, The CMU Arctic speech databases, 223
Krishnamurthy, 1986, Two-channel speech analysis, IEEE Trans. Acoust. Speech Signal Process., 34, 730, 10.1109/TASSP.1986.1164909
Larsson, 2000, Vocal fold vibrations: high-speed imaging, kymography and acoustic analysis: a preliminary report, Laryngoscope, 110, 2117, 10.1097/00005537-200012000-00028
Laver, 1980
Legát, 2011, On the detection of pitch marks using a robust multi-phase algorithm, Speech Commun., 53, 552, 10.1016/j.specom.2011.01.008
Lieberman, 1963, Some acoustic measures of the fundamental periodicity of normal and pathologic larynges, J. Acoust. Soc. Am., 35, 344, 10.1121/1.1918465
Lohscheller, 2008, Phonovibrography: mapping high-speed movies of vocal fold vibrations into 2-D diagrams for visualizing and analyzing the underlying laryngeal dynamics, IEEE Trans. Med. Imag., 27, 300, 10.1109/TMI.2007.903690
Lulich, 2009, Source-filter interaction in the opposite direction: subglottal coupling and the influence of vocal fold mechanics on vowel spectra during the closed phase, 6, 10.1121/1.3269926
Ma, 1994, A Frobenius norm approach to glottal closure detection from the speech signal, IEEE Trans. Speech Audio Process., 2, 258, 10.1109/89.279274
Mehta, 2011, Automated measurement of vocal fold vibratory asymmetry from high-speed videoendoscopy recordings, J. Speech Lang. Hear. Res., 54, 47, 10.1044/1092-4388(2010/10-0026)
Mittal, 2013, Effect of glottal dynamics in the production of shouted speech, J. Acoust. Soc. Am., 133, 3050, 10.1121/1.4796110
Moore, 2008, Critical analysis of the impact of glottal features in the classification of clinical depression in speech, IEEE Trans. Biomed. Eng., 55, 96, 10.1109/TBME.2007.900562
Moulines, 1990, Detection of the glottal closure by jumps in the statistical properties of the speech signal, Speech Commun., 9, 401, 10.1016/0167-6393(90)90017-4
Murty, 2008, Epoch extraction from speech signals, IEEE Trans. Audio Speech Lang. Process., 16, 1602, 10.1109/TASL.2008.2004526
Naylor, 2007, Estimation of glottal closure instants in voiced speech using the DYPSA algorithm, IEEE Trans. Audio Speech Lang. Process., 15, 34, 10.1109/TASL.2006.876878
Prasad, 2016, Determination of glottal open regions by exploiting changes in the vocal tract system characteristics, J. Acoust. Soc. Am., 140, 666, 10.1121/1.4958681
Prathosh, 2013, Epoch extraction based on integrated linear prediction residual using plosion index, IEEE Trans. Audio Speech Lang. Process., 21, 2471, 10.1109/TASL.2013.2273717
Ramesh, 2013, Detection of glottal opening instants using Hilbert envelope, 44
Rao, 2007, Determination of instants of significant excitation in speech using Hilbert envelope and group-delay function, IEEE Signal Process. Lett., 14, 762, 10.1109/LSP.2007.896454
Rothenberg, 1981, Acoustic interaction between the glottal source and the vocal tract, Vocal Fold Physiol., 305
Rothenberg, 1988, Monitoring vocal fold abduction through vocal fold contact area, J. Speech Hear. Res., 31, 338, 10.1044/jshr.3103.338
Schleusing, 2013, Joint source-filter optimization for accurate vocal tract estimation using differential evolution, IEEE Trans. Audio Speech Lang. Process., 21, 1560, 10.1109/TASL.2013.2255275
Silva, 2009, Jitter estimation algorithms for detection of pathological voices, EURASIP J. Adv. Signal Process., 10.1155/2009/567875
Source: https://covarep.github.io/covarep/.
Source: https://geostat.bordeaux.inria.fr/index.php/downloads.html.
Stevens, 1977, Physics of laryngeal behavior and larynx models, Phonetica, 34, 264, 10.1159/000259885
Thati, 2013, Synthesis of laughter by modifying excitation characteristics, J. Acoust. Soc. Am., 133, 3072, 10.1121/1.4798664
Thomas, 2012, Estimation of glottal closing and opening instants in voiced speech using the YAGA algorithm, IEEE Trans. Audio Speech Lang. Process., 20, 82, 10.1109/TASL.2011.2157684
Thomas, 2009, The SIGMA algorithm: a glottal activity detector for electroglottographic signals, IEEE Trans. Audio Speech Lang. Process., 17, 1557, 10.1109/TASL.2009.2022430
Titze, 2004, Theory of glottal airflow and source-filter interaction in speaking and singing, Acta Acustica united with Acustica, 90, 641
Titze, 2008, Nonlinear source filter coupling in phonation: theory, J. Acoust. Soc. Am., 123, 2733, 10.1121/1.2832337
Veldhuis, 1998, A computationally efficient alternative for the Liljencrants-Fant model and its perceptual evaluation, J. Acoust. Soc. Am., 103, 566, 10.1121/1.421103
Walker, 2007, A review of glottal waveform analysis, Springer Lecture Notes Comput. Sci. (LNCS), 4391, 1, 10.1007/978-3-540-71505-4_1
Wong, 1979, Least squares glottal inverse filtering from the acoustic speech waveform, IEEE Trans. Acoust. Speech Signal Process., 27, 350, 10.1109/TASSP.1979.1163260
Yan, 2006, Automatic tracing of vocal-fold motion from high-speed digital images, IEEE Trans. Biomed. Eng., 53, 1394, 10.1109/TBME.2006.873751
Yegnanarayana, 2011, Epoch-based analysis of speech signals, Sadhana, 36, 651, 10.1007/s12046-011-0046-0
Yegnanarayana, 2013, Spectro-temporal analysis of speech signals using zero-time windowing and group delay function, Speech Commun., 55, 782, 10.1016/j.specom.2013.02.007
Yegnanarayana, 1998, Extraction of vocal-tract system characteristics from speech signals, IEEE Trans. Speech Audio Process., 6, 313