Heteroscedastic discriminant analysis and reduced rank HMMs for improved speech recognition

Speech Communication - Volume 26, Issue 4 - Pages 283–297 - 1998
Nagendra Kumar, Andreas G. Andreou
Electrical and Computer Engineering Department, Center for Language and Speech Processing, Johns Hopkins University, 3400 N. Charles Street, Baltimore, MD 21218, USA

Abstract

Keywords


References

Akaike, 1974, A new look at the statistical model identification, IEEE Transactions on Automatic Control, 19, 716, 10.1109/TAC.1974.1100705

Aubert, X., Haeb-Umbach, R., Ney, H., 1993. Continuous mixture densities and linear discriminant analysis for improved context-dependent acoustic models. In: Proc. of ICASSP, Vol. 2, pp. 648–651

Ayer, C.M., 1992. A discriminatively derived transform capable for improved speech recognition accuracy. Ph.D. Thesis, University of London

Ayer, C.M., Hunt, M.J., Brookes, D.M., 1993. A discriminatively derived linear transform for improved speech recognition. In: Proc. Eurospeech 93, Vol. 1, pp. 583–586

Bartlett, 1947, Multivariate analysis, J. Roy. Statist. Soc. B, 9, 176

Baum, 1970, A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains, Ann. Math. Stat., 41, 164, 10.1214/aoms/1177697196

Bocchieri, 1993, Discriminative feature selection for speech recognition, Computer Speech and Language, 7, 229, 10.1006/csla.1993.1012

Brown, P.F., 1987. The acoustic-modelling problem in automatic speech recognition. Ph.D. Thesis, Carnegie Mellon University

Campbell, 1984, Canonical variate analysis – a general formulation, Australian Journal of Statistics, 26, 86, 10.1111/j.1467-842X.1984.tb01271.x

Cohen, 1989, Application of an auditory model to speech recognition, J. Acoust. Soc. Amer., 85, 2623, 10.1121/1.397756

Davis, 1980, Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences, IEEE Transactions on Acoustics, Speech, and Signal Processing, 28, 357, 10.1109/TASSP.1980.1163420

Dempster, A.P., Laird, N.M., Rubin, D.B., 1977. Maximum likelihood from incomplete data via the EM algorithm. J. Roy. Statist. Soc. B 39, 1–38

Dillon, W.R., Goldstein, M., 1984. Multivariate Analysis. Wiley, New York

Doddington, G., 1989. Phonetically sensitive discriminants for improved speech recognition. In: Proceedings 1989 ICASSP, no. S10 b.11, pp. 556–559

Duda, R.O., Hart, P.E., 1973. Pattern Classification and Scene Analysis. Wiley, New York

Engle, R.F., 1995. ARCH: Selected Readings. Oxford Univ. Press, Oxford

Fisher, 1936, The use of multiple measurements in taxonomic problems, Ann. Eugen., 7, 179, 10.1111/j.1469-1809.1936.tb02137.x

Fisher, 1938, The statistical utilization of multiple measurements, Ann. Eugen., 8, 376, 10.1111/j.1469-1809.1938.tb02189.x

Fukunaga, K., 1990. Introduction to Statistical Pattern Recognition. Academic Press, New York

Furui, 1986, Speaker-independent isolated word recognition using dynamic features of speech spectrum, IEEE Transactions on Acoustics, Speech, and Signal Processing, 34, 52, 10.1109/TASSP.1986.1164788

Haeb-Umbach, R., Ney, H., 1992. Linear discriminant analysis for improved large vocabulary continuous speech recognition. In: Proc. ICASSP, Vol. 1, pp. 13–16

Haeb-Umbach, R., Geller, D., Ney, H., 1993. Improvement in connected digit recognition using linear discriminant analysis and mixture densities. In: Proceedings of ICASSP, pp. 239–242

Hastie, T., Tibshirani, R., 1994. Discriminant analysis by Gaussian mixtures. Tech. Rep., AT&T Bell Laboratories

Hermansky, 1990, Perceptual linear predictive (PLP) analysis of speech, J. Acoust. Soc. Amer., 87, 1738, 10.1121/1.399423

Hunt, M., 1979. A statistical approach to metrics for word and syllable recognition. In: 98th Meeting of the Acoustical Society of America, November

Hunt, M.J., Lefebvre, C., 1989. A comparison of several acoustic representations for speech recognition with degraded and undegraded speech. In: Proc. ICASSP, Vol. 1, pp. 262–265

Jankowski Jr., C.R., 1992. A comparison of auditory models for automatic speech recognition. Master's Thesis, MIT

Kumar, N., 1997. Investigation of silicon auditory models and generalization of linear discriminant analysis for improved speech recognition. Ph.D. Thesis, Johns Hopkins University, http://olympus.ece.jhu.edu/archives/phd/nkumar97/index.html

Kumar, N., Andreou, A., 1996a. On generalizations of linear discriminant analysis. Tech. Rep., Electrical and Computer Engineering Technical Report-96-07, April

Kumar, N., Andreou, A., 1996b. Generalization of linear discriminant analysis in maximum likelihood framework. In: Proceedings of Joint Meeting of American Statistical Association, Chicago, IL, August

Kumar, N., Andreou, A., submitted. Heteroscedastic discriminant analysis: maximum likelihood feature extraction for heteroscedastic models. IEEE Transactions on Pattern Analysis and Machine Intelligence

Kumar, N., Neti, C., Andreou, A., 1995. Application of discriminant analysis to speech recognition with auditory features. In: Proceedings of the 15th Annual Speech Research Symposium, Johns Hopkins University, Baltimore, MD, pp. 153–160, June

Rabiner, 1975, An algorithm for determining the endpoints of isolated utterances, Bell Syst. Tech. J., 54, 297, 10.1002/j.1538-7305.1975.tb02840.x

Rao, C.R., 1965. Linear Statistical Inference and its Applications. Wiley, New York

Rissanen, J., 1989. Stochastic Complexity in Statistical Inquiry. Series in Computer Science, Vol. 15. World Scientific, Singapore

Roth, R., Baker, J.K., Baker, J.M., Gillick, L., Hunt, M.J., Ito, Y., Loewe, S., Orloff, J., Peskin, B., Scattone, F., 1993. Large vocabulary continuous speech recognition of Wall Street Journal data. In: Proc. ICASSP, Vol. 2, pp. 640–643

Schwarz, 1978, Estimating the dimension of a model, Annals of Statistics, 6, 461, 10.1214/aos/1176344136

Siohan, O., 1995. On the robustness of linear discriminant analysis as a preprocessing step for noisy speech recognition. In: Proc. ICASSP, Vol. 1, pp. 125–128

Sun, D., 1997. Feature dimensionality reduction using reduced-rank maximum likelihood estimation for hidden Markov models. In: International Conference on Language and Speech, pp. 244–247

Wood, L., Pearce, D., Novello, F., 1991. Improved vocabulary-independent sub-word HMM modelling. In: Proc. ICASSP, Vol. 1, pp. 181–184

Woodland, P.C., Cole, D.R., 1991. Optimising hidden Markov models using discriminative output distribution. In: Proc. ICASSP, Vol. 1, pp. 545–548

Yu, G., Russell, W., Schwartz, R., Makhoul, J., 1990. Discriminant analysis and supervised vector quantization for continuous speech recognition. In: Proceedings of ICASSP, pp. 685–688, April

Zahorian, S.A., Qian, D., Jagharghi, A.J., 1991. Acoustic-phonetic transformations for improved speaker-independent isolated word recognition. In: Proc. ICASSP, Vol. 1, pp. 561–564