IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01.

Governing organization: N/A

Featured papers

Improved MFCC feature extraction by PCA-optimized filter-bank for speech recognition
- Pages 49-52
Shang-Ming Lee, Shi-Hau Fang, Jeih-weih Hung, Lin-Shan Lee
Although Mel-frequency cepstral coefficients (MFCC) have been proven to perform very well under most conditions, some limited efforts have been made in optimizing the shape of the filters in the filter-bank in the conventional MFCC approach. This paper presents a new feature extraction approach that designs the shapes of the filters in the filter-bank. In this new approach, the filter-bank coeffic...
#Mel frequency cepstral coefficient #Feature extraction #Speech recognition #Shape #Filters #Principal component analysis #Additive noise #Working environment noise #Noise shaping #Cepstral analysis
Statistical learning of language pronunciation structure
- Pages 339-342
F. Korkmazskiy
This paper presents a new approach to rule based pronunciation generation. The system presented can automatically learn a new language pronunciation structure and use this knowledge for pronunciation generation for an arbitrary context sensitive language. Unlike conventional text-to-speech systems which are based on the cost expensive human expert knowledge about a specific language, this system c...
#Statistical learning #Speech synthesis #Humans #Dictionaries #Databases #Natural languages #Speech recognition #Multimedia communication #Costs #Decision trees
Adaptive training for robust ASR
- Pages 15-20
M.J.F. Gales
Adaptive training is a powerful training technique for building speech recognition systems on nonhomogeneous data. The aim is to remove unwanted variability, such as changes in speaker, channel or acoustic environment, from desired changes, the acoustic differences between words. During training, two sets of models are generated: a canonical model set for the desired "true" variability of the spee...
#Robustness #Automatic speech recognition #Loudspeakers #Speech recognition #Training data #Target recognition #Feature extraction #Acoustical engineering #Data engineering #Power engineering and energy
Computing consensus translation from multiple machine translation systems
- Pages 351-354
B. Bangalore, G. Bordel, G. Riccardi
We address the problem of computing a consensus translation given the outputs from a set of machine translation (MT) systems. The translations from the MT systems are aligned with a multiple string alignment algorithm and the consensus translation is then computed. We describe the multiple string alignment algorithm and the consensus MT hypothesis computation. We report on the subjective and objec...
#Natural languages #Tagging #Text categorization #Performance evaluation #Optical wavelength conversion #Impedance matching #Robustness #Speech recognition #Stochastic processes #Automatic speech recognition
Improvements on a semi-automatic grammar induction framework
- Pages 288-291
Chin-Chung Wong, H. Meng
This work extends the semi-automatic grammar induction approach previously proposed (see Meng, H. and Siu, K.C., IEEE Trans. on Knowledge and Data Engineering). The data-driven approach learns semantic and phrasal categories from a training corpus of unannotated natural language queries in a specific domain. The approach can be seeded with prespecified semantic categories to expedite the learning ...
#Natural languages #Testing #Equations #Databases #Laboratories #Systems engineering and theory #Research and development management #Scalability #Humans #Speech recognition
Pseudo 2-dimensional hidden Markov models in speech recognition
- Pages 441-444
S. Werner, G. Rigoll
In this paper, the usage of pseudo 2-dimensional hidden Markov models for speech recognition is discussed. This image processing method should better model the time-frequency structure in speech signals. The method calculates the emission probability of a standard HMM by embedded HMM for each state. If a temporal sequence of spectral vectors is imagined as a spectrogram, this leads to a 2-dimensio...
#Hidden Markov models #Speech recognition #Feature extraction #Image processing #Speech processing #Spectrogram #Computer science #Signal processing #Frequency #Databases
MLLR adaptation techniques for pronunciation modeling
- Pages 421-424
U. Venkataramani, W. Byrne
Multiple regression class MLLR (maximum likelihood linear regression) transforms are investigated for use with pronunciation models that predict variation in the observed pronunciations given the phonetic context. Regression classes can be constructed so that MLLR transforms can be estimated and used to model specific acoustic changes associated with pronunciation variation. The effectiveness of t...
#Maximum likelihood linear regression #Automatic speech recognition #Predictive models #Dictionaries #Natural languages #Speech processing #Context modeling #Speech analysis #Surface treatment #Decision trees
Example-based query generation for spontaneous speech
- Pages 268-271
H. Murao, N. Kawaguchi, S. Matsubara, Y. Inagaki
This paper proposes a new query generation method that is based on examples of human-to-human dialogue. Along with modeling the information flow in dialogue, a system for information retrieval in-car has been designed. The system refers to the dialogue corpus to find an example that is similar to input speech, and makes a query from the example. We also give the experimental results to show the ef...
#Speech #Information retrieval #Databases #Humans #Acoustical engineering #Natural languages #Robustness
Very large vocabulary proper name recognition for directory assistance
- Pages 222-225
F. Bechet, R. de Mori, G. Subsol
This paper deals with the difficult task of recognition of a large vocabulary of proper names in a directory assistance application. After a presentation of the related work, it introduces a methodology for rescoring the N-best hypotheses generated by a first step recognition. First experiments give encouraging results and several topics for future research are presented.
#Vocabulary #Acoustic distortion #Speech recognition #Error analysis #Lattices #Research and development #Robustness #Automatic speech recognition #Hidden Markov models
Evaluating long-term spectral subtraction for reverberant ASR
- Pages 103-106
D. Gelbart, N. Morgan
Even a modest degree of room reverberation can greatly increase the difficulty of automatic speech recognition. We have observed large increases in speech recognition word error rates when using a far-field (3-6 feet) microphone in a conference room, in comparison with recordings from head-mounted microphones. In this paper, we describe experiments with a proposed remedy based on the subtraction o...
#Automatic speech recognition #Spectral analysis #Reverberation #Speech recognition #Microwave integrated circuits #Cepstral analysis #Computer science #Error analysis #Absorption #Fourier transforms