Improved MFCC feature extraction by PCA-optimized filter-bank for speech recognition
Abstract
Although Mel-frequency cepstral coefficients (MFCC) have been shown to perform very well under most conditions, only limited effort has gone into optimizing the shapes of the filters in the filter-bank used by the conventional MFCC approach. This paper presents a new feature extraction approach that designs the shapes of those filters. In this approach, the filter-bank coefficients are data-driven: they are obtained by applying principal component analysis (PCA) to the FFT spectrum of the training data. The experimental results show that the method is robust in noisy environments and that its gains are additive with those of other noise-handling techniques.
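The abstract does not give the details of the PCA step, so the following is only a minimal sketch of one way a data-driven filter-bank could be derived from FFT spectra. Everything specific here is an assumption for illustration, not the paper's procedure: the function names `pca_filter_bank` and `log_filter_bank_energies`, the hand-picked `band_edges`, the choice of the first principal component of each band as that band's filter shape, and the use of an absolute value before the logarithm.

```python
import numpy as np

def pca_filter_bank(spectra, band_edges):
    """Derive one filter per band from training spectra (illustrative only).

    spectra:    array of shape (n_frames, n_bins), FFT magnitude spectra
    band_edges: list of (lo, hi) bin ranges, hi exclusive (assumed, hand-picked)
    Returns an array of shape (n_bands, n_bins).
    """
    n_bins = spectra.shape[1]
    bank = np.zeros((len(band_edges), n_bins))
    for i, (lo, hi) in enumerate(band_edges):
        band = spectra[:, lo:hi]
        centered = band - band.mean(axis=0)
        # Sample covariance of the FFT bins inside this band.
        cov = centered.T @ centered / max(band.shape[0] - 1, 1)
        # eigh returns eigenvalues in ascending order, so the last
        # eigenvector is the first principal component.
        _, eigvecs = np.linalg.eigh(cov)
        weights = eigvecs[:, -1]
        # Resolve the sign ambiguity of the eigenvector.
        if weights.sum() < 0:
            weights = -weights
        bank[i, lo:hi] = weights
    return bank

def log_filter_bank_energies(spectra, bank, eps=1e-10):
    """Project spectra onto the filters and take the log.

    PCA weights can be negative, so the absolute value is taken before
    the log; this is an assumption made for this sketch.
    """
    return np.log(np.abs(spectra @ bank.T) + eps)

# Synthetic stand-in for training spectra (no real speech data is used here).
rng = np.random.default_rng(0)
spectra = np.abs(rng.normal(size=(200, 64)))
band_edges = [(0, 16), (8, 32), (24, 64)]

bank = pca_filter_bank(spectra, band_edges)
features = log_filter_bank_energies(spectra, bank)
print(bank.shape, features.shape)
```

In a full MFCC-style pipeline, a discrete cosine transform would normally follow the log filter-bank energies; that step is omitted here because the abstract only concerns the filter shapes.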
Keywords
#Mel frequency cepstral coefficient #Feature extraction #Speech recognition #Shape #Filters #Principal component analysis #Additive noise #Working environment noise #Noise shaping #Cepstral analysis