Robust ASR using Support Vector Machines

Speech Communication - Volume 49 - Pages 253-267 - 2007
R. Solera-Ureña, D. Martín-Iglesias, A. Gallardo-Antolín, C. Peláez-Moreno, F. Díaz-de-María
Signal Theory and Communications Department, EPS-Universidad Carlos III de Madrid, Avda. Universidad, 30, Leganés 28911, Spain

References

Allwein, 2000, Reducing multiclass to binary: a unifying approach for margin classifiers, J. Mach. Learn. Res., 1, 113
Bengio, 1995
Bourlard, 1994
Burges, 1996, Simplified support vector decision rules, 71
Chih-Chung, C., Chih-Jen, L., 2004. LibSVM: a library for Support Vector Machines. Available from: <http://www.csie.ntu.edu.tw/cjlin/libsvm/>.
Clarkson, 1999, On the use of Support Vector Machines for phonetic classification, IEEE Internat. Conf. Acoust. Speech Signal Process., 2, 585
Collobert, R., SVMTorch: a Support Vector Machine for large-scale regression and classification problems, IDIAP. Available from: <www.idiap.ch/learning/SVMTorch.html>.
Crammer, 2001, On the algorithmic implementation of multiclass kernel-based vector machines, J. Mach. Learn. Res., 2, 265
Ech-Cherif, A., Kohili, M., Benyettou, A., Benyettou, M., 2002. Lagrangian Support Vector Machines for phoneme classification. In: Proc. 9th Internat. Conf. on Neural Information Processing (ICONIP’02), Vol. 5, Singapore, pp. 2507–2511.
Fine, S., Navratil, J., Gopinath, R., 2001. A hybrid GMM/SVM approach to speaker identification. In: Proc. Internat. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), Salt Lake City, UT, USA, Vol. 1, pp. 417–420.
Fürnkranz, 2002, Round robin classification, J. Mach. Learn. Res., 2, 721
Ganapathiraju, A., 2002. Support vector machines for speech recognition. PhD Thesis, Mississippi State University.
Ganapathiraju, A., Hamaker, J., Picone, J., 2000. Hybrid SVM/HMM architectures for speech recognition. In: Proc. 2000 Speech Transcription Workshop, Maryland, USA, Vol. 4, pp. 504–507.
Ganapathiraju, 2004, Applications of support vector machines to speech recognition, IEEE Trans. Signal Process., 52, 2348, 10.1109/TSP.2004.831018
Gangashetty, S., Sekhar, C., Yegnanarayana, B., 2005. Combining evidence from multiple classifiers for recognition of consonant–vowel units of speech in multiple languages. In: Proc. Internat. Conf. on Intelligent Sensing and Information Processing, Chennai, India, pp. 387–391.
García-Cabellos, J., Peláez-Moreno, C., Gallardo-Antolín, A., Pérez-Cruz, F., Díaz-de-María, F., 2004. SVM classifiers for ASR: a discussion about parameterization. In: Proc. EUSIPCO 2004, Wien, Austria, pp. 2067–2070.
Glass, 2003, A probabilistic framework for segment-based speech recognition, Comput. Speech Language, 17, 137, 10.1016/S0885-2308(03)00006-8
Hamaker, J., Picone, J., 2003. Advances in speech recognition using sparse Bayesian methods, unpublished (January 2003).
Hamaker, J., Picone, J., Ganapathiraju, A., 2002. A sparse modeling approach to speech recognition based on relevance vector machines. In: Proc. Internat. Conf. on Spoken Language Processing, Denver, CO, USA, Vol. 2, pp. 1001–1004.
Hsu, 2002, A comparison of methods for multi-class support vector machines, IEEE Trans. Neural Networks, 13, 415, 10.1109/72.991427
Iso, K., Watanabe, T., 1990. Speaker-independent word recognition using a neural prediction model. In: Proc. Internat. Conf. on Acoustics, Speech and Signal Processing (ICASSP), Albuquerque, NM, USA, pp. 441–444.
Jaakkola, T., Haussler, D., 1998. Exploiting generative models in discriminative classifiers. Technical Report, Department of Computer Science, University of California. Available from: <citeseer.ist.psu.edu/jaakkola98exploiting.html>.
Jiang, 2006, Large margin hidden Markov models for speech recognition, IEEE Trans. Audio Speech Language Process., 14, 1584, 10.1109/TASL.2006.879805
Joachims, 1999, Advances in kernel methods: support vector learning, 169
Le, Q., Bengio, S., 2003. Client dependent GMM–SVM models for speaker verification. In: Internat. Conf. on Artificial Neural Networks, ICANN/ICONIP. Springer-Verlag, Berlin, pp. 443–451.
Lin, H.T., Lin, C.J., Weng, R.C., 2003. A note on Platt’s probabilistic outputs for Support Vector Machines. Technical Report, Department of Computer Science and Information Engineering, National Taiwan University.
Ma, C., Randolph, M., Drish, J., 2001. A Support Vector Machines-based rejection technique for speech recognition. In: Proc. Internat. Conf. Acoustics, Speech and Signal Processing (ICASSP), Salt Lake City, UT, USA, Vol. 1, pp. 381–384.
Martín-Iglesias, D., Bernal-Chaves, J., Peláez-Moreno, C., Gallardo-Antolín, A., Díaz-de-María, F., 2005. A speech recognizer based on multiclass SVMs with HMM-guided segmentation. In: Nonlinear Analyses and Algorithms for Speech Processing. Lecture Notes in Computer Science, Vol. LNAI 3817, Springer, pp. 256–266.
Moreno, A., 1998. SpeechDat documentation [cd-rom], ver 1.
Navia-Vázquez, 2001, Weighted least squares training of support vector classifiers leading to compact and adaptive schemes, IEEE Trans. Neural Networks, 12, 1047, 10.1109/72.950134
Osuna, E., Freund, R., Girosi, F., 1997. An improved training algorithm for Support Vector Machines. In: IEEE Workshop on Neural Networks for Signal Processing, Amelia Island, FL, USA, pp. 276–285.
Platt, J.C., 1999. Probabilities for SV machines. In: Advances in Large Margin Classifiers. MIT Press, pp. 61–74.
Rabiner, 1978, Considerations in dynamic time warping algorithms for discrete word recognition, IEEE Trans. Acoustics Speech Signal Process., 26, 575, 10.1109/TASSP.1978.1163164
Reichl, W., Ruske, G., 1995. A hybrid RBF-HMM system for continuous speech recognition. In: Proc. Internat. Conf. on Acoustics, Speech and Signal Processing (ICASSP), Detroit, MI, USA, pp. 3335–3338.
Robinson, T., Hochberg, M., Renals, S., 1995. The Use of Recurrent Neural Networks in Continuous Speech Recognition. In: Automatic Speech and Speaker Recognition: Advanced Topics, Kluwer Academic Publishers, pp. 159–184 (Chap. 19).
Sakoe, H., Isotani, R., Yoshida, K., Iso, K., Watanabe, T., 1989. Speaker-independent word recognition using dynamic programming neural networks. In: Proc. Internat. Conf. on Acoustics, Speech and Signal Processing (ICASSP), Glasgow, Scotland, pp. 439–442.
Schölkopf, 2002
Sekhar, C., Lee, W., Takeda, K., Itakura, F., 2003. Acoustic modelling of subword units using Support Vector Machines. In: Workshop on Spoken Language Processing, Mumbai, India.
Shimodaira, 2001, Support vector machine with Dynamic Time-Alignment Kernel for speech recognition, 1841
Shimodaira, H., Noma, K., Nakai, M., 2002. Dynamic time-alignment kernel in Support Vector Machine. In: Advances in Neural Information Processing Systems 14, Vol. 2. MIT Press, Cambridge, MA, pp. 921–928.
Smith, N., Gales, M., 2002. Using SVMs and discriminative models for speech recognition. In: IEEE Internat. Conf. on Acoustics, Speech and Signal Processing, Vol. 1, Orlando, FL, USA, pp. 77–80.
Smith, N., Gales, M., 2002. Speech recognition using SVMs. In: Advances in Neural Information Processing Systems 14, Vol. 14. MIT Press, Cambridge, MA, pp. 1197–1204.
Smith, N., Niranjan, M., 2000. Data-dependent kernels in SVM classification of speech patterns. In: Proc. Internat. Conf. on Spoken Language Processing (ICSLP), Beijing, China, Vol. 1, pp. 297–300.
Stadermann, J., Rigoll, G., 2004. A hybrid SVM/HMM acoustic modeling approach to automatic speech recognition. In: Proc. Internat. Conf. on Spoken Language Processing (ICSLP), Jeju Island, Korea, pp. 661–664.
Tebelskis, J., Waibel, A., Petek, B., Schmidbauer, O., 1991. Continuous speech recognition using predictive neural networks. In: Proc. Internat. Conf. on Acoustics, Speech and Signal Processing (ICASSP), Toronto, Canada, pp. 61–64.
Thubthong, 2001, Support vector machines for Thai phoneme recognition, Internat. J. Uncertainty Fuzziness Knowledge-Based Syst., 9, 803, 10.1016/S0218-4885(01)00125-3
Trentin, 2001, A survey of hybrid ANN/HMM models for automatic speech recognition, Neurocomputing, 37, 91, 10.1016/S0925-2312(00)00308-8
Vapnik, 1995
Vapnik, 1998
Varga, A., Steeneken, J., Tomlinson, M., Jones, D., 1992. The NOISEX-92 study on the effect of additive noise on automatic speech recognition. Technical Report, DRA Speech Research Unit.
Vicente-Peña, 2006, Band-pass filtering of the time sequences of spectral parameters for robust wireless speech recognition, Speech Commun., 48, 1379, 10.1016/j.specom.2006.07.007
Wan, V., Renals, S., 2003. Support vector machine speaker verification methodology. In: Internat. Conf. on Acoustics, Speech and Signal Processing (ICASSP), Hong Kong, Vol. 2, pp. 221–224.
Weiss, 1993
Weston, J., Watkins, C., 1999. Multi-class Support Vector Machines. In: M. Verleysen (Ed.), Proc. European Symposium on Artificial Neural Networks.
Wu, 2004, Probability estimates for multi-class classification by pairwise coupling, J. Mach. Learn. Res., 5, 975
Young, S., 1995. HTK-Hidden Markov Model toolkit (ver 2.1), Cambridge University.