Mixed source model and its adapted vocal tract filter estimate for voice transformation and synthesis
References
Alku, 1999, A method for generating natural-sounding speech stimuli for cognitive brain research, Clin. Neurophysiol., 110, 1329, 10.1016/S1388-2457(99)00088-7
ITU Radiocommunication Assembly, 2003. ITU-R BS.1284-1: General methods for the subjective assessment of sound quality. Technical Report. ITU.
Bechet, 2001, Liaphon: un système complet de phonétisation de textes, Traitement Automatique des Langues, 42, 47
de Cheveigné, 2002, YIN, a fundamental frequency estimator for speech and music, J. Acoust. Soc. Amer., 111, 1917, 10.1121/1.1458024
Degottex, 2011, Phase minimization for glottal model estimation, IEEE Trans. Audio Speech Lang. Process., 19, 1080, 10.1109/TASL.2010.2076806
Fant, 1995, The LF-model revisited. Transformations and frequency domain analysis, STL-QPSR, 36, 119
Flanagan, J.L., Golden, R.M., 1966. Phase Vocoder. Technical Report. The Bell System Technical Journal.
Gales, 1999, Semi-tied covariance matrices for hidden Markov models, IEEE Trans. Speech Audio Process., 7, 272, 10.1109/89.759034
Griffin, 1988, Multiband excitation vocoder, IEEE Trans. Acoust. Speech Signal Process., 36, 1223, 10.1109/29.1651
Henrich, N., 2001. Etude de la source glottique en voix parlée et chantée. Ph.D. thesis. UPMC, France (In French).
Hermes, 1991, Synthesis of breathy vowels: some research methods, Speech Comm., 10, 497, 10.1016/0167-6393(91)90053-V
Imai, 1979, Spectral envelope extraction by improved cepstral method, Electron. Comm., 10
Kawahara, 1999, Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds, Speech Comm., 27, 187, 10.1016/S0167-6393(98)00085-5
Kim, 2007, Two-band excitation for HMM-based speech synthesis, IEICE – Trans. Inf. Systems, 378, 10.1093/ietisy/e90-1.1.378
Markel, 1976
McAulay, 1986, Speech analysis/synthesis based on a sinusoidal representation, IEEE Trans. Acoust. Speech Signal Process., 34, 744, 10.1109/TASSP.1986.1164910
Oppenheim, 1968, Nonlinear filtering of multiplied and convolved signals, Proc. IEEE, 56, 1264, 10.1109/PROC.1968.6570
Pantazis, 2010, Adaptive AM–FM signal decomposition with application to speech analysis, IEEE Trans. Audio Speech Lang. Process., 19, 290, 10.1109/TASL.2010.2047682
Peeters, G., 2001. Modeles et modification du signal sonore adaptees a ses caracteristiques locales. Ph.D. thesis. UPMC, France (In French).
Raitio, 2011, HMM-based speech synthesis utilizing glottal inverse filtering, IEEE Trans. Audio Speech Lang. Process., 19, 153, 10.1109/TASL.2010.2045239
Rodet, 1984, The CHANT project: from synthesis of the singing voice to synthesis in general, Comput. Music J., 8, 15, 10.2307/3679810
Roebel, 2007, On cepstral and all-pole based spectral envelope modeling with unknown model order, Pattern Recognition Lett., 28, 1343, 10.1016/j.patrec.2006.11.021
Stevens, 1971, Airflow and turbulence noise for fricative and stop consonants: static considerations, J. Acoust. Soc. Amer., 50, 1180, 10.1121/1.1912751
Stylianou, 2001, Applying the harmonic plus noise model in concatenative speech synthesis, IEEE Trans. Speech Audio Process., 9, 21, 10.1109/89.890068
Tooher, M., McKenna, J.G., 2003. Variation of the glottal LF parameters across F0, vowels, and phonetic environment. In: Proc. ISCA Voice Quality: Functions, Analysis and Synthesis (VOQUAL), pp. 41–46.
Yeh, C., 2008. Multiple fundamental frequency estimation of polyphonic recordings. Ph.D. thesis. UPMC-Ircam, France.
Zen, H., Nose, T., Yamagishi, J., Sako, S., Masuko, T., Black, A., Tokuda, K., 2007. The HMM-based speech synthesis system (HTS) version 2.0. In: Proc. ISCA Workshop on Speech Synthesis (SSW). <http://hts.sp.nitech.ac.jp>.