Histogram based normalization in the acoustic feature space

S. Molau, M. Pitz, H. Ney
Lehrstuhl für Informatik VI, Computer Science Department, RWTH Aachen - University of Technology, Aachen, Germany

Abstract

We describe a technique called histogram normalization that aims at normalizing feature space distributions at different stages of the signal analysis front-end, namely the log-compressed filterbank vectors, the cepstrum coefficients, and the LDA (linear discriminant analysis) transformed acoustic vectors. The best results are obtained at the filterbank stage, and in most cases there is a minor additional gain when normalization is applied sequentially at several stages. We show that histogram normalization performs best if applied in both training and recognition, and that smoothing the target histogram obtained on the training data is also helpful. On the VerbMobil II corpus, a German large-vocabulary conversational speech recognition task, we achieve an overall reduction in word error rate of about 10% relative.
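As a rough illustration of the general idea, the sketch below maps each feature dimension of an utterance onto a target distribution estimated from training data by matching cumulative distributions. It is a minimal sketch under stated assumptions, not the authors' implementation: it approximates the cumulative histograms with evenly spaced quantiles, it omits the smoothing of the target histogram mentioned above, and the function names `fit_target` and `histogram_normalize` as well as the parameter `n_quantiles` are hypothetical.

```python
import numpy as np

def fit_target(train_feats, n_quantiles=101):
    """Estimate per-dimension target quantiles from training features.

    train_feats: array of shape (frames, dims).
    Returns the probability grid and the quantiles, shape (n_quantiles, dims).
    """
    probs = np.linspace(0.0, 1.0, n_quantiles)
    return probs, np.quantile(train_feats, probs, axis=0)

def histogram_normalize(feats, probs, target_q):
    """Map each dimension of feats onto the target distribution.

    Assumption: the source distribution is estimated from feats itself
    (e.g. all frames of one speaker or utterance).
    """
    src_q = np.quantile(feats, probs, axis=0)
    out = np.empty_like(feats, dtype=float)
    for d in range(feats.shape[1]):
        # Cumulative probability of each value under the source distribution.
        p = np.interp(feats[:, d], src_q[:, d], probs)
        # Invert the target cumulative distribution at that probability.
        out[:, d] = np.interp(p, probs, target_q[:, d])
    return out
```

Because the mapping is monotonic in each dimension, it preserves the ordering of values while reshaping their distribution. The same function could in principle be applied at any of the stages named above (filterbank, cepstrum, or LDA output), since it only assumes a frames-by-dimensions array.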

Keywords

Histograms, Filter bank, Signal analysis, Cepstrum, Linear discriminant analysis, Target recognition, Smoothing methods, Training data, Speech recognition, Error analysis
