Multi-frame GMM-based block quantisation of line spectral frequencies

Speech Communication - Volume 47 - Pages 265-276 - 2005
Stephen So, Kuldip K. Paliwal
School of Microelectronic Engineering, Griffith University, Nathan Campus, Brisbane QLD 4111, Australia

References

Atal, 1979, Predictive coding of speech signals and subjective error criteria, IEEE Trans. Acoust., Speech, Signal Process., ASSP-27, 247, 10.1109/TASSP.1979.1163237
Campbell, Jr., J.P., Welch, V.C., Tremain, T.E., 1989. An expandable error-protected 4800 bps CELP Coder (U.S. Federal Standard 4800 bps voice coder). In: Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing, Glasgow, Scotland, May 1989, pp. 735–738.
Dempster, 1977, Maximum likelihood from incomplete data via the EM algorithm, J. Roy. Stat. Soc., 39, 1
Gardner, 1995, Theoretical analysis of the high-rate vector quantization of LPC parameters, IEEE Trans. Speech Audio Process., 3, 367, 10.1109/89.466658
Gersho, 1992
Gray, 1976, Quantization and bit allocation in speech processing, IEEE Trans. Acoust., Speech, Signal Process., ASSP-24, 459, 10.1109/TASSP.1976.1162857
Hedelin, 2000, Vector quantization based on Gaussian mixture models, IEEE Trans. Speech Audio Process., 8, 385, 10.1109/89.848220
Huang, 1963, Block quantization of correlated Gaussian random variables, IEEE Trans. Commun. Syst., CS-11, 289, 10.1109/TCOM.1963.1088759
Itakura, 1975, Line spectrum representation of linear predictive coefficients of speech signals, J. Acoust. Soc. Am., 57, S35, 10.1121/1.1995189
Itakura, 1969, Speech analysis-synthesis based on the partial autocorrelation coefficient, Proc. JSA, 199
Kroon, 1995, Linear-prediction based analysis-by-synthesis coding, 79
LeBlanc, 1993, Efficient search and design procedures for robust multi-stage VQ of LPC parameters for 4kb/s speech coding, IEEE Trans. Speech Audio Process., 1, 373, 10.1109/89.242483
Linde, 1980, An algorithm for vector quantizer design, IEEE Trans. Commun., COM-28, 84, 10.1109/TCOM.1980.1094577
Nurminen, J., 2003. Multi-mode quantization of adjacent speech parameters using a low-complexity prediction scheme. In: Proc. EuroSpeech, September 2003, pp. 1073–1076.
Paliwal, 1993, Efficient vector quantization of LPC parameters at 24 bits/frame, IEEE Trans. Speech Audio Process., 1, 3, 10.1109/89.221363
Paliwal, 1995, Quantization of LPC parameters, 443
Paliwal, K.K., So, S., 2004. Multiple frame block quantisation of line spectral frequencies using Gaussian mixture models. In: Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing, Montreal, pp. I-149–I-152.
Proakis, 1996
Subramaniam, A.D., Rao, B.D., 2000. PDF optimized parametric vector quantization with applications to speech coding. In: 34th Asilomar Conference on Signals, Systems and Computers, Pacific Grove, CA, November 2000.
Subramanian, A.D., Rao, B.D., 2001. Speech LSF quantization with rate independent complexity, bit scalability and learning. In: Proc. IEEE Int. Conf. Acoust., Speech, and Signal Processing, vol. 2, pp. 705–708.
Subramaniam, 2003, PDF optimized parametric vector quantization of speech line spectral frequencies, IEEE Trans. Speech Audio Process., 11, 130, 10.1109/TSA.2003.809192
Shabestary, T.Z., Hedelin, P., 2002. Spectral quantization by companding. In: Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing, vol. 1, pp. 641–644.
Sinervo, U., Nurminen, J., Heikkinen, A., Saarinen, J., 2003. Multi-mode matrix quantizer for low bit rate LSF quantization. In: Proc. EuroSpeech, September 2003, pp. 1073–1076.
Soong, F.K., Juang, B.H., 1984. Line spectrum pair (LSP) and speech data compression. In: Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing, San Diego, California, March 1984, pp. 37–40.
Sugamura, 1986, Speech analysis and synthesis methods developed at ECL in NTT–from LPC to LSP–, Speech Commun., 5, 199, 10.1016/0167-6393(86)90008-7
Tsao, 1985, Matrix quantizer design for LPC speech using the generalized Lloyd algorithm, IEEE Trans. Acoust., Speech, Signal Process., ASSP-33, 537, 10.1109/TASSP.1985.1164584
Viswanathan, 1975, Quantization properties of transmission parameters in linear predictive systems, IEEE Trans. Acoust., Speech, Signal Process., ASSP-23, 309, 10.1109/TASSP.1975.1162675
Xydeas, 1999, Split matrix quantization of LPC parameters, IEEE Trans. Speech Audio Process., 7, 113, 10.1109/89.748117