Improved MFCC feature extraction by PCA-optimized filter-bank for speech recognition - Pages 49-52
Shang-Ming Lee, Shi-Hau Fang, Jeih-weih Hung, Lin-Shan Lee
Although Mel-frequency cepstral coefficients (MFCC) have been proven to perform
very well under most conditions, only limited effort has been made to
optimize the shape of the filters in the filter-bank of the conventional MFCC
approach. This paper presents a new feature extraction approach that designs the
shapes of the filters in the filter-bank. In this new approach, the filter-bank
coeffic...
#Mel frequency cepstral coefficient #Feature extraction #Speech recognition #Shape #Filters #Principal component analysis #Additive noise #Working environment noise #Noise shaping #Cepstral analysis
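For context, the conventional filter-bank that the abstract above sets out to improve can be sketched as follows. This is a minimal illustration of the standard triangular mel filter-bank only, not the PCA-optimized design of the paper, whose details are not visible in the truncated abstract. The function names, the 2595/700 mel formula variant, and the parameter values are assumptions chosen for illustration.

```python
import math


def hz_to_mel(f_hz):
    # One common mel-scale formula (an assumption; other variants exist).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def triangular_mel_filterbank(n_filters, n_fft, sample_rate):
    """Return n_filters rows, each holding n_fft // 2 + 1 triangular weights."""
    n_bins = n_fft // 2 + 1
    mel_max = hz_to_mel(sample_rate / 2.0)
    # n_filters + 2 edge frequencies, equally spaced on the mel scale.
    edges_hz = [mel_to_hz(mel_max * i / (n_filters + 1))
                for i in range(n_filters + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_bins)]

    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = edges_hz[m - 1], edges_hz[m], edges_hz[m + 1]
        row = []
        for f in bin_hz:
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))   # rising slope
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank


bank = triangular_mel_filterbank(n_filters=20, n_fft=512, sample_rate=16000)
```

Each row of `bank` is one fixed triangular filter. A data-driven approach such as the one the abstract describes would replace these fixed weights with learned ones.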
Recognition of negative emotions from the speech signal - Pages 240-243
C.M. Lee, S. Narayanan, R. Pieraccini
This paper reports on methods for automatic classification of spoken utterances
based on the emotional state of the speaker. The data set used for the analysis
comes from a corpus of human-machine dialogues recorded from a commercial
application deployed by SpeechWorks. Linear discriminant classification with
Gaussian class-conditional probability distribution and k-nearest neighbors
methods are u...
#Emotion recognition #Speech recognition #Principal component analysis #Automatic speech recognition #Speech analysis #Man machine systems #Linear discriminant analysis #Probability distribution #Statistical distributions #Frequency
Computing consensus translation from multiple machine translation systems - Pages 351-354
S. Bangalore, G. Bordel, G. Riccardi
We address the problem of computing a consensus translation given the outputs
from a set of machine translation (MT) systems. The translations from the MT
systems are aligned with a multiple string alignment algorithm and the consensus
translation is then computed. We describe the multiple string alignment
algorithm and the consensus MT hypothesis computation. We report on the
subjective and objec...
#Natural languages #Tagging #Text categorization #Performance evaluation #Optical wavelength conversion #Impedance matching #Robustness #Speech recognition #Stochastic processes #Automatic speech recognition
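One simple way to picture the consensus step described above is column-wise majority voting over hypotheses that have already been aligned. The sketch below assumes the multiple string alignment has been done elsewhere and that gaps are marked with a hypothetical `GAP` symbol. It is an illustration only and not necessarily the consensus computation used in the paper.

```python
from collections import Counter

GAP = "-"  # hypothetical gap symbol produced by the alignment step


def consensus(aligned):
    """Pick the most frequent token in each aligned column, dropping gaps."""
    width = len(aligned[0])
    if any(len(row) != width for row in aligned):
        raise ValueError("all aligned hypotheses must have the same length")
    result = []
    for column in zip(*aligned):
        token, _count = Counter(column).most_common(1)[0]
        if token != GAP:
            result.append(token)
    return result


# Three toy MT outputs, pre-aligned by hand for illustration.
aligned_hypotheses = [
    ["the", "cat", "sat",  "-",     "down"],
    ["the", "cat", "sits", "-",     "down"],
    ["a",   "cat", "sat",  "right", "down"],
]
voted = consensus(aligned_hypotheses)
```

Ties between equally frequent tokens are resolved here by first occurrence, which is an arbitrary choice. A real system would more plausibly weight the votes, for example by a language model score.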
Verification of multi-class recognition decision using classification approach - Pages 123-126
T. Matsui, F.K. Soong, Biing-Hwang Juang
We investigate various strategies to improve the utterance verification
performance using a 2-class pattern classifier. They include utilizing N-best
candidate scores, modifying segmentation boundaries, applying background and
out-of-vocabulary filler models, incorporating contexts, and minimizing
verification errors via discriminative training. A connected-digit database
containing utterances rec...
#Testing #Automatic speech recognition #Natural languages #Context modeling #Databases #Microphones #Performance evaluation #Man machine systems #Degradation #Working environment noise
Task-specific adaptation of speech recognition models - Pages 433-436
A. Sankar, A. Kannan, B. Shahshahani, E. Jackson
Most published adaptation research focuses on speaker adaptation, and on
adaptation for noisy channels and background environments. We study acoustic,
grammar, and combined acoustic and grammar adaptation for creating task-specific
recognition models. Comprehensive experimental results are presented using data
from natural language quotes and a trading application. The results show that
task adapt...
#Speech recognition #Hidden Markov models #Distributed computing #Loudspeakers #Acoustic applications #Adaptation model #Smoothing methods #Acoustic noise #Background noise #Working environment noise
Improved pronunciation modelling by inverse word frequency and pronunciation entropy - Pages 53-56
Ming-yi Tsai, Fu-chiang Chou, Lin-shan Lee
We propose a new approach to rank the potential pronunciations for each word by
their pronunciation frequency and inverse word frequency (pf-iwf) weights. The
pronunciation set obtained in this way can then be pruned with different
criteria. This approach not only considers the frequencies of occurrence of the
pronunciations, but tries to minimize the extra confusion which may be
introduced by pro...
#Inverse problems #Frequency #Entropy #Automatic speech recognition #Vocabulary #Natural languages #Costs #Training data #Dynamic programming #Heuristic algorithms
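The abstract above does not give the pf-iwf formula, so the following sketch assumes a definition by analogy with tf-idf: the pronunciation frequency is the share of a word's occurrences realized with a given pronunciation, and the inverse word frequency is the log of the vocabulary size divided by the number of words sharing that pronunciation. Both the formula and the toy counts are assumptions for illustration.

```python
import math
from collections import defaultdict


def pf_iwf(counts):
    """counts maps (word, pronunciation) -> number of observed occurrences."""
    word_totals = defaultdict(int)
    words_per_pron = defaultdict(set)
    for (word, pron), n in counts.items():
        word_totals[word] += n
        words_per_pron[pron].add(word)

    n_words = len(word_totals)
    scores = {}
    for (word, pron), n in counts.items():
        pf = n / word_totals[word]                           # how typical for this word
        iwf = math.log(n_words / len(words_per_pron[pron]))  # how unique to this word
        scores[(word, pron)] = pf * iwf
    return scores


def rank_pronunciations(scores, word):
    """Pronunciations of `word`, highest assumed pf-iwf weight first."""
    prons = [p for (w, p) in scores if w == word]
    return sorted(prons, key=lambda p: scores[(word, p)], reverse=True)


# Toy counts (invented for illustration): "t uw" is shared by "to" and "two".
toy_counts = {
    ("to", "t uw"): 6, ("to", "t ah"): 4,
    ("two", "t uw"): 10,
    ("the", "dh ah"): 9, ("the", "dh iy"): 1,
}
toy_scores = pf_iwf(toy_counts)
ranking_to = rank_pronunciations(toy_scores, "to")
```

In this toy data the less frequent pronunciation "t ah" outranks "t uw" for the word "to", because "t uw" is also a pronunciation of "two" and is therefore more confusable. A ranked list like this could then be pruned by a threshold or a fixed count.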
Collaborative steering of microphone array and video camera toward multi-lingual tele-conference through speech-to-speech translation - Pages 119-122
T. Nishiura, R. Gruhn, S. Nakamura
Capturing distant-talking speech with high quality is very important for
multilingual teleconferencing through speech-to-speech translation. In addition,
the speaker image is needed to realize natural communication in such a
conference. A microphone array is an ideal candidate for capturing
distant-talking speech. Uttered speech can be enhanced and speaker images can be
captured by stee...
#Microphone arrays #Collaboration #Cameras #Teleconferencing #Speech synthesis #Direction of arrival estimation #Loudspeakers #Acoustic noise #Working environment noise #Natural languages
Dialogue management in the Talk'n'Travel system - Pages 235-239
D. Stallard
A central problem for mixed-initiative dialogue management is coping with user
utterances that fall outside of the expected sequence of dialogue. Independent
initiative by the user may require a complete revision of the future course of
the dialogue, even when the system is engaged in activities of its own, such as
querying a database, etc. This paper presents an event-driven, goal-based
dialogue ...
#Databases #Natural languages #Technology management #Prototypes #Telephony #Speech recognition #Robustness #Communication system control
Time-varying noise compensation by sequential Monte Carlo method - Pages 163-166
K. Yao, S. Nakamura
We present a sequential Monte Carlo method applied to additive noise
compensation for robust speech recognition in time-varying noise. At each frame,
the method generates a set of samples that approximate the posterior distribution
of speech and noise parameters given the observation sequence up to the current
frame. An explicit model representing noise effects on speech features is used,
so that an e...
#Noise generators #Additive noise #Speech enhancement #Noise robustness #Speech recognition #Predictive models #State estimation #Mean square error methods #Smoothing methods #Inference algorithms
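The per-frame sampling idea in the abstract above can be illustrated with a generic bootstrap particle filter. The sketch below tracks a single slowly drifting scalar (standing in for a noise parameter) from noisy observations. The random-walk state model, the Gaussian observation model, and all parameter values are assumptions, and the paper's explicit model of noise effects on speech features is not reproduced here.

```python
import math
import random


def particle_filter(observations, n_particles=500,
                    drift_std=0.05, obs_std=0.5, seed=0):
    """Return the posterior-mean estimate of the hidden scalar at each frame."""
    rng = random.Random(seed)
    # Broad initial guess for the hidden parameter (an assumed prior).
    particles = [rng.uniform(-5.0, 5.0) for _ in range(n_particles)]
    estimates = []
    for y in observations:
        # Predict: propagate each sample through the random-walk model.
        particles = [x + rng.gauss(0.0, drift_std) for x in particles]
        # Weight: Gaussian likelihood of the current observation.
        weights = [math.exp(-0.5 * ((y - x) / obs_std) ** 2) for x in particles]
        total = sum(weights)
        if total == 0.0:
            # Every likelihood underflowed; fall back to uniform weights.
            weights = [1.0 / n_particles] * n_particles
        else:
            weights = [w / total for w in weights]
        # Estimate: weighted mean approximates the posterior mean.
        estimates.append(sum(w * x for w, x in zip(weights, particles)))
        # Resample: multinomial resampling in proportion to the weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates


# Synthetic observations of a constant level 2.0 with additive Gaussian noise.
data_rng = random.Random(1)
observations = [2.0 + data_rng.gauss(0.0, 0.5) for _ in range(200)]
estimates = particle_filter(observations)
```

Resampling at every frame keeps the sketch short. Practical implementations often resample only when the effective sample size drops below a threshold.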
Robust speech recognition with multi-channel codebook dependent cepstral normalization (MCDCN) - Pages 151-154
S. Deligne, R. Gopinath
We address the issue of speech recognition in the presence of interfering
signals, in cases where the signals corrupting the speech are recorded in
separate channels. We propose to combine a trivial form of filtering with MCDCN,
a multi-channel version of codebook dependent cepstral normalization, where the
cepstra of the noise are estimated from the reference signals. We report on
recognition exp...
#Robustness #Speech recognition #Cepstral analysis #Speech synthesis #Adaptive filters #Decorrelation #Filtering #Nonlinear filters #Linear systems #Acoustic noise