Time-varying noise compensation by sequential Monte Carlo method
Abstract
We present a sequential Monte Carlo method for additive noise compensation, aimed at robust speech recognition in time-varying noise. At each frame, the method generates a set of samples that approximates the posterior distribution of the speech and noise parameters given the observation sequence up to the current frame. Because an explicit model of the effect of noise on speech features is used, an extended Kalman filter can be constructed for each sample. Each filter produces an updated continuous state, which serves as that sample's estimate of the noise parameter, and a prediction likelihood, which serves as the sample's weight. The time-varying noise parameter is then inferred over these samples in the minimum mean square error sense. A selection step and a smoothing step are used to improve efficiency. In experiments, we observed a significant performance improvement over noise compensation that assumes stationary noise. The method also performed better than the sequential EM algorithm in machine-gun noise.
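The per-frame procedure described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation, and it rests on several assumptions that are not stated in the abstract: a scalar log-power feature, a mismatch function y = log(exp(x) + exp(n)), a two-component Gaussian mixture standing in for the speech model, a random-walk model for the noise, and multinomial resampling as the selection step. The smoothing step is omitted. All names (`smc_noise_tracker`, `mu`, `var`, `pi`, `q`) are hypothetical.

```python
import numpy as np


def smc_noise_tracker(y, mu, var, pi, n_particles=200, q=0.01,
                      m0=0.0, p0=1.0, seed=0):
    """Track a time-varying noise log-power n_t from noisy log-powers y_t.

    Assumed model (illustrative only):
      speech: x_t ~ N(mu[k], var[k]) with mixture weights pi
      noise:  n_t = n_{t-1} + v_t,  v_t ~ N(0, q)
      obs:    y_t = log(exp(x_t) + exp(n_t))
    """
    rng = np.random.default_rng(seed)
    m = np.full(n_particles, m0, dtype=float)  # per-sample noise mean
    p = np.full(n_particles, p0, dtype=float)  # per-sample noise variance
    est = np.empty(len(y))

    for t, y_t in enumerate(y):
        # Sample a speech mixture component for every particle.
        k = rng.choice(len(pi), size=n_particles, p=pi)

        # EKF prediction under the random-walk noise model.
        m_pred = m
        p_pred = p + q

        # Linearise the mismatch function around the predicted noise mean.
        h = np.logaddexp(mu[k], m_pred)
        H = np.exp(m_pred - h)           # dh/dn = e^n / (e^x + e^n)
        s = H * H * p_pred + var[k]      # innovation variance
        gain = p_pred * H / s
        innov = y_t - h

        # EKF update: continuous state is the noise estimate of each sample.
        m = m_pred + gain * innov
        p = (1.0 - gain * H) * p_pred

        # Prediction likelihood used as the sample weight (log domain).
        log_w = -0.5 * (np.log(2.0 * np.pi * s) + innov ** 2 / s)
        w = np.exp(log_w - log_w.max())
        w /= w.sum()

        # MMSE estimate: weighted mean of the per-sample noise estimates.
        est[t] = np.dot(w, m)

        # Selection step: multinomial resampling proportional to weight.
        idx = rng.choice(n_particles, size=n_particles, p=w)
        m, p = m[idx], p[idx]

    return est


# Synthetic usage: noise log-power ramps upward while speech switches
# between a low-energy and a high-energy component.
sim = np.random.default_rng(1)
mu = np.array([2.0, 8.0])
var = np.array([1.0, 1.0])
pi = np.array([0.5, 0.5])
T = 200
true_n = np.linspace(4.0, 6.0, T)
comp = sim.choice(2, size=T, p=pi)
x = sim.normal(mu[comp], np.sqrt(var[comp]))
y = np.logaddexp(x, true_n)
est = smc_noise_tracker(y, mu, var, pi, m0=4.0)
```

Resampling at every frame keeps the sketch short; a practical system would more likely resample only when the effective sample size drops, and would work on vector-valued features rather than a single log-power value.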
Keywords
#Noise generators #Additive noise #Speech enhancement #Noise robustness #Speech recognition #Predictive models #State estimation #Mean square error methods #Smoothing methods #Inference algorithms
