IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01.

Governing body: N/A

Featured papers

Speech recognition using advanced HMM2 features
- Pages 65-68
K. Weber, S. Bengio, H. Bourlard
HMM2 is a particular hidden Markov model where state emission probabilities of the temporal (primary) HMM are modeled through (secondary) state-dependent frequency-based HMMs (see Weber, K. et al., Proc. ICSLP, vol.III, p.147-50, 2000). As we show in another paper (see Weber et al., Proc. Eurospeech, Sep. 2001), a secondary HMM can also be used to extract robust ASR features. Here, we further inve...
#Speech recognition #Hidden Markov models #Frequency #Feature extraction #Robustness #Automatic speech recognition #Data mining #Spatial databases #Error analysis #Indexes
An open concept metric for assessing dialog system complexity
- Pages 264-267
T.M. DuBois, A.I. Rudnicky
Techniques for assessing dialog system performance commonly focus on characteristics of the interaction, using metrics such as completion, satisfaction or time on task. However, such metrics are not always capable of differentiating systems that operate on fundamentally different principles, particularly when tested on tasks that focus on common-denominator capabilities. We introduce a new metric,...
#System performance #Computer science #System testing #Problem-solving #Equations #Performance analysis #Particle measurements #Cities and towns #History
Investigating stochastic speech understanding
- Pages 260-263
H. Bonneau-Maynard, F. Lefevre
The need for human expertise in the development of a speech understanding system can be greatly reduced by the use of stochastic techniques. However, corpus-based techniques require the annotation of large amounts of training data. Manual semantic annotation of such corpora is tedious, expensive, and subject to inconsistencies. This work investigates the influence of the training corpus size on the...
#Stochastic processes #Costs #Natural languages #Stochastic systems #Humans #Data mining #Speech analysis #Training data #Performance evaluation #Telephony
Recursive noise estimation using iterative stochastic approximation for stereo-based robust speech recognition
- Pages 81-84
Li Deng, J. Droppo, A. Acero
We present an algorithm for recursive estimation of parameters in a mildly nonlinear model involving incomplete data. In particular, we focus on the time-varying deterministic parameters of additive noise in the nonlinear model. For the nonstationary noise that we encounter in robust speech recognition, different observation data segments correspond to different noise parameter values. Hence, recu...
#Recursive estimation #Stochastic resonance #Noise robustness #Speech enhancement #Working environment noise #Iterative algorithms #Testing #Cepstral analysis #Acoustic noise #Piecewise linear approximation
Time-varying noise compensation by sequential Monte Carlo method
- Pages 163-166
K. Yao, S. Nakamura
We present a sequential Monte Carlo method applied to additive noise compensation for robust speech recognition in time-varying noise. At each frame, the method generates a set of samples, approximating the posterior distribution of speech and noise parameters for given observation sequences to the current frame. An explicit model representing noise effects on speech features is used, so that an e...
#Noise generators #Additive noise #Speech enhancement #Noise robustness #Speech recognition #Predictive models #State estimation #Mean square error methods #Smoothing methods #Inference algorithms
Robust speech recognition with multi-channel codebook dependent cepstral normalization (MCDCN)
- Pages 151-154
S. Deligne, R. Gopinath
We address the issue of speech recognition in the presence of interfering signals, in cases where the signals corrupting the speech are recorded in separate channels. We propose to combine a trivial form of filtering with MCDCN, a multi-channel version of codebook dependent cepstral normalization, where the cepstra of the noise are estimated from the reference signals. We report on recognition exp...
#Robustness #Speech recognition #Cepstral analysis #Speech synthesis #Adaptive filters #Decorrelation #Filtering #Nonlinear filters #Linear systems #Acoustic noise
MLLR adaptation techniques for pronunciation modeling
- Pages 421-424
V. Venkataramani, W. Byrne
Multiple regression class MLLR (maximum likelihood linear regression) transforms are investigated for use with pronunciation models that predict variation in the observed pronunciations given the phonetic context. Regression classes can be constructed so that MLLR transforms can be estimated and used to model specific acoustic changes associated with pronunciation variation. The effectiveness of t...
#Maximum likelihood linear regression #Automatic speech recognition #Predictive models #Dictionaries #Natural languages #Speech processing #Context modeling #Speech analysis #Surface treatment #Decision trees
Improved pronunciation modelling by inverse word frequency and pronunciation entropy
- Pages 53-56
Ming-yi Tsai, Fu-chiang Chou, Lin-shan Lee
We propose a new approach to rank the potential pronunciations for each word by their pronunciation frequency and inverse word frequency (pf-iwf) weights. The pronunciation set obtained in this way can then be pruned with different criteria. This approach not only considers the frequencies of occurrence of the pronunciations, but tries to minimize the extra confusion which may be introduced by pro...
#Inverse problems #Frequency #Entropy #Automatic speech recognition #Vocabulary #Natural languages #Costs #Training data #Dynamic programming #Heuristic algorithms
Trend tying in the segmental-feature HMM
- Pages 45-48
Young-Sun Yun
We present a method for reducing the number of parameters in the segmental-feature HMM (SFHMM). If the SFHMM gives better results than the CHMM, its number of parameters is larger than that of the CHMM. Therefore, a new approach is needed to reduce the number of parameters. Similarly, a trajectory can be separated into trend and location. Since the trend means the variation of the segmental features and occupies a large part of...
#Hidden Markov models #Speech #Polynomials #Information technology #Electronic mail #Quantization #Linear systems #Working environment noise #Gaussian distribution #Feature extraction
Speaker-trained recognition using allophonic enrollment models
- Pages 61-64
V. Vanhoucke, M.M. Hochberg, C.J. Leggetter
We introduce a method for performing speaker-trained recognition based on context-dependent allophone models from a large-vocabulary, speaker-independent recognition system. A set of speaker-enrollment templates is selected from the context-dependent allophone models. These templates are used to build representations of the speaker-enrolled utterances. The advantages of this approach include impro...
#Context modeling #Engines #Acoustics #Databases #Testing #Degradation #Speech recognition #Natural languages #Vocabulary #Data mining