Phoneme recognition using ICA-based feature extraction and transformation

Signal Processing - Volume 84 - Pages 1005-1019 - 2004

Oh-Wook Kwon¹, Te-Won Lee²

¹School of Electrical and Computer Engineering, Chungbuk National University, 48 Gaesin-dong, Heungdeok-gu, Cheongju, Chungbuk 361-763, South Korea

²Institute for Neural Computation, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0523, USA

References

S. Amari, Neural learning in structured parameter spaces—natural Riemannian gradient, in: Advances in Neural Information Processing Systems, Vol. 9, MIT Press, Cambridge, MA, 1997, pp. 127–133.
Bell, 1995, An information-maximization approach to blind separation and blind deconvolution, Neural Comput., 7, 1129, 10.1162/neco.1995.7.6.1129
Bell, 1996, Learning the higher-order structure of a natural sound, Network Comput. Neural Syst., 7, 261, 10.1088/0954-898X/7/2/005
Bell, 1997, The ‘independent components’ of natural scenes are edge filters, Vision Res., 37, 3327, 10.1016/S0042-6989(97)00121-1
Box, 1992
ETSI Standard, Speech processing, transmission and quality aspects (STQ); distributed speech recognition; advanced front-end feature extraction algorithm; compression algorithms, ETSI ES 202 050 v1.1.1, October 2002.
T. Fukuda, et al., Peripheral features for HMM-based speech recognition, in: Proceedings of the International Conference on Acoustics, Speech, Signal Processing, Salt Lake City, UT, 2001.
R. Gemello, et al., Integration of fixed and multiple resolution analysis in a speech recognition system, in: Proceedings of the International Conference on Acoustics, Speech, Signal Processing, Salt Lake City, UT, 2001.
H. Hermansky, et al., RASTA-PLP speech analysis technique, in: Proceedings of the International Conference on Acoustics, Speech, Signal Processing, San Francisco, CA, March 1992, pp. 1121–1124.
Hyvärinen, 2001, Topographic independent component analysis, Neural Comput., 13, 1527, 10.1162/089976601750264992
Hyvärinen, 2001
G.-J. Jang, T.-W. Lee, A probabilistic approach to single channel blind signal separation, in: Advances in Neural Information Processing Systems, 15, MIT Press, Cambridge, MA, 2003.
G.-J. Jang, S.-J. Yun, Y.-H. Oh, Feature vector transformation using ICA and its application to speaker verification, in: Proceedings of the EUROSPEECH 99, Budapest, Hungary, September 1999, pp. 767–770.
S. Kajarekar, et al., A study of two dimensional linear discriminants for ASR, in: Proceedings of the International Conference on Acoustics, Speech, Signal Processing, Salt Lake City, UT, 2001.
Lee, 1998
Lee, 1989, Speaker-independent phone recognition using hidden Markov models, IEEE Trans. Acoust. Speech, Signal Process., 37, 1641, 10.1109/29.46546
J.H. Lee, H.Y. Jung, T.W. Lee, S.Y. Lee, Speech feature extraction using independent component analysis, in: Proceedings of the International Conference on Acoustics, Speech, Signal Processing, Istanbul, Turkey, June 2000, pp. 1631–1634.
Lee, 2002, On the efficient speech feature extraction based on independent component analysis, Neural Process. Lett., 15, 235, 10.1023/A:1015777200976
Lewicki, 2002, Efficient coding of natural sounds, Nat. Neurosci., 5, 356, 10.1038/nn831
Olshausen, 1996, Emergence of simple-cell receptive field properties by learning a sparse code for natural images, Nature, 381, 607, 10.1038/381607a0
O'Shaughnessy, 1999
L. Parra, C. Spence, P. Sajda, Higher-order statistical properties arising from the non-stationarity of natural signals, in: Advances in Neural Information Processing Systems, Vol. 13, MIT Press, Cambridge, MA, 2001.
Potamitis, 2000, Independent component analysis applied to feature extraction for robust automatic speech recognition, Electron. Lett., 36, 1977, 10.1049/el:20001365
O. Schwartz, E.E. Simoncelli, Natural sound statistics and divisive normalization in the auditory system, in: Advances in Neural Information Processing Systems, Vol. 13, MIT Press, Cambridge, MA, 2001.
P. Somervuo, Experiments with linear and nonlinear feature transformations in HMM based phone recognition, in: Proceedings of the International Conference on Acoustics, Speech, Signal Processing, Hong Kong, China, April 2003, pp. I-52–I-55.
S.J. Young, The general use of tying in phoneme-based HMM speech recognisers, in: Proceedings of the International Conference on Acoustics, Speech, Signal Processing, San Francisco, CA, March 1992, pp. 1569–1572.
S. Young, D. Kershaw, J. Odell, D. Ollason, V. Valtchev, P. Woodland, The HTK Book, Cambridge University Engineering Department, Cambridge, UK, 2002.
Q. Zhu, A. Alwan, An efficient and scalable 2D DCT-based feature coding scheme for remote speech recognition, in: Proceedings of the International Conference on Acoustics, Speech, Signal Processing, Salt Lake City, UT, 2001.
R.E. Ziemer, W.H. Tranter, Principles of Communications: Systems, Modulation, and Noise, 5th Edition, Wiley, New York, 2002, pp. 76–83.
