IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01.

Managing organization: N/A

Field:

Featured papers

A comparative study of model-based adaptation techniques for a compact speech recognizer
- Pages 29-32
F. Thiele, R. Bippus
Many techniques for speaker adaptation have been successfully applied to automatic speech recognition. This paper compares the performance of several adaptation methods with respect to their memory need and processing demand. For adaptation of a compact acoustic model with 4k densities, eigenvoices and structural MAP (SMAP) are investigated next to the well-known techniques of MAP (maximum a poste...
#Adaptation model #Speech recognition #Loudspeakers #Automatic speech recognition #Maximum likelihood linear regression #Laboratories #Error analysis #Command and control systems #Degradation #Regression tree analysis
Simultaneous recognition of distant talking speech of multiple sound sources based on 3-D N-best search algorithm
- Pages 111-114
P. Heracleous, S. Nakamura, K. Shikano
This paper deals with the simultaneous recognition of distant-talking speech of multiple talkers using the 3D N-best search algorithm. We describe the basic idea of the 3D N-best search and we address two additional techniques implemented into the baseline system. Namely, a path distance-based clustering and a likelihood normalization technique appeared to be necessary in order to build an efficie...
#Speech recognition #Hidden Markov models #Viterbi algorithm #Search methods #Natural languages #Clustering algorithms #Reverberation #Feature extraction #Adaptive systems #Sorting
Author index
- Pages 467-468 - 2001
The author index contains an entry for each author and coauthor included in the proceedings record.
Grammar learning for spoken language understanding
- Pages 292-295
Ye-Yi Wang, A. Acero
Many state-of-the-art conversational systems use semantic-based robust understanding and manually derived grammars, a very time-consuming and error-prone process. This paper describes a machine-aided grammar authoring system that enables a programmer to develop rapidly a high quality grammar for conversational systems. This is achieved with a combination of domain-specific semantics, a library gra...
#Natural languages #Robustness #Authoring systems #Programming profession #Libraries #Information systems #Computer errors #Writing #Law #Legal factors
Statistical learning of language pronunciation structure
- Pages 339-342
F. Korkmazskiy
This paper presents a new approach to rule based pronunciation generation. The system presented can automatically learn a new language pronunciation structure and use this knowledge for pronunciation generation for an arbitrary context sensitive language. Unlike conventional text-to-speech systems which are based on the cost expensive human expert knowledge about a specific language, this system c...
#Statistical learning #Speech synthesis #Humans #Dictionaries #Databases #Natural languages #Speech recognition #Multimedia communication #Costs #Decision trees
Adaptive training for robust ASR
- Pages 15-20
M.J.F. Gales
Adaptive training is a powerful training technique for building speech recognition systems on nonhomogeneous data. The aim is to remove unwanted variability, such as changes in speaker, channel or acoustic environment, from desired changes, the acoustic differences between words. During training, two sets of models are generated: a canonical model set for the desired "true" variability of the spee...
#Robustness #Automatic speech recognition #Loudspeakers #Speech recognition #Training data #Target recognition #Feature extraction #Acoustical engineering #Data engineering #Power engineering and energy
Recognition of negative emotions from the speech signal
- Pages 240-243
C.M. Lee, S. Narayanan, R. Pieraccini
This paper reports on methods for automatic classification of spoken utterances based on the emotional state of the speaker. The data set used for the analysis comes from a corpus of human-machine dialogues recorded from a commercial application deployed by SpeechWorks. Linear discriminant classification with Gaussian class-conditional probability distribution and k-nearest neighbors methods are u...
#Emotion recognition #Speech recognition #Principal component analysis #Automatic speech recognition #Speech analysis #Man machine systems #Linear discriminant analysis #Probability distribution #Statistical distributions #Frequency
Internet evolution and progress in full automatic French language modelling
- Pages 363-366
D. Vaufreydaz, M. Gery
The World Wide Web is the greatest information space ever seen, distributed all over the world, in many languages, on many various topics. We first describe the evolution of a French subset of this space during the last 3 years. During this time, the size of automatically extracted text for language modelling has multiplied by 6.5. Moreover, French coverage has grown from 140,000 to 200,000 lexica...
#Internet #Natural languages #Speech recognition #Web server #Robots #HTML #Web sites #Data mining #Crawlers #Stochastic processes
The symbiosis of DSP and speech recognition or an outsider's view of the inside
- Pages 1-4
J.F. Kaiser
From an historical review of how we got to where we are now, we discuss the interrelationship between our system design objectives and goals, our modeling of the speech signal and its generation and parameterization, and the broadly developing DSP methodology. We take a critical look at some of the underlying assumptions in our modeling to see if they may be limiting the performance that can be o...
#Symbiosis #Digital signal processing #Speech recognition #Automatic speech recognition #Telephony #Speech synthesis #Physics #Laboratories #Speech coding #Mathematical model
MLLR adaptation techniques for pronunciation modeling
- Pages 421-424
U. Venkataramani, W. Byrne
Multiple regression class MLLR (maximum likelihood linear regression) transforms are investigated for use with pronunciation models that predict variation in the observed pronunciations given the phonetic context. Regression classes can be constructed so that MLLR transforms can be estimated and used to model specific acoustic changes associated with pronunciation variation. The effectiveness of t...
#Maximum likelihood linear regression #Automatic speech recognition #Predictive models #Dictionaries #Natural languages #Speech processing #Context modeling #Speech analysis #Surface treatment #Decision trees