Distributed speech recognition with codec parameters
Abstract
Communication devices that perform distributed speech recognition (DSR) tasks currently transmit standardized coded parameters of speech signals. On a remote server, recognition features are extracted from signals reconstructed from these parameters. Since reconstruction losses degrade recognition performance, proposals are being considered to standardize DSR codecs that derive recognition features, which are then transmitted and used directly for recognition. However, such a codec must be embedded on the transmitting device alongside its current standard codec. Performing recognition directly on codec bitstreams avoids these complications: no additional feature-extraction mechanism is required on the device, and there are no reconstruction losses on the server. We propose a method based on linear discriminant analysis (LDA) for extracting optimal feature sets from codec bitstreams and demonstrate that the features so derived improve recognition performance for the LPC, GSM and CELP codecs. For GSM and CELP, we show that the performance is comparable to that obtained with uncoded speech and with standard DSR-codec features.
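The abstract does not spell out the LDA procedure, so the following is only a minimal sketch of generic Fisher LDA, which may differ from the authors' exact method. It assumes each speech frame is represented by a vector of decoded codec parameters with a class label (for example, a phone or HMM state). The function name `fit_lda`, the `ridge` parameter, and the synthetic data standing in for codec parameters are all hypothetical.

```python
import numpy as np


def fit_lda(X, y, n_components, ridge=1e-6):
    """Return a (d, n_components) projection that maximizes between-class
    scatter relative to within-class scatter (generic Fisher LDA)."""
    classes = np.unique(y)
    d = X.shape[1]
    mean_all = X.mean(axis=0)
    Sw = np.zeros((d, d))  # within-class scatter
    Sb = np.zeros((d, d))  # between-class scatter
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        centered = Xc - mc
        Sw += centered.T @ centered
        diff = (mc - mean_all)[:, None]
        Sb += len(Xc) * (diff @ diff.T)
    # Small ridge (an assumption) keeps Sw invertible for correlated parameters.
    Sw += ridge * np.eye(d)
    # Whiten the within-class scatter, then diagonalize the between-class scatter.
    evals_w, evecs_w = np.linalg.eigh(Sw)
    whiten = evecs_w / np.sqrt(evals_w)
    evals_b, evecs_b = np.linalg.eigh(whiten.T @ Sb @ whiten)
    top = np.argsort(evals_b)[::-1][:n_components]
    return whiten @ evecs_b[:, top]


# Synthetic stand-in for codec parameter vectors: 3 classes, 10 parameters,
# with class information carried by only two of the parameters.
rng = np.random.default_rng(0)
means = np.zeros((3, 10))
means[1, 0] = 4.0
means[2, 1] = 4.0
y = np.repeat(np.arange(3), 200)
X = means[y] + rng.normal(size=(600, 10))

W = fit_lda(X, y, n_components=2)  # at most (number of classes - 1) useful directions
Z = X @ W                          # reduced features passed to the recognizer
```

In a real system the rows of `X` would come from parameters parsed out of the LPC, GSM or CELP bitstream, and the projected features `Z` would replace features computed from reconstructed speech.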
Keywords
#Speech recognition #Codecs #Feature extraction #GSM #Propagation losses #Degradation #Performance loss #Proposals #Code standards #Linear predictive coding