
IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01.

Governing body: N/A

Field:

Featured papers

Improved MFCC feature extraction by PCA-optimized filter-bank for speech recognition
- Pages 49-52
Shang-Ming Lee, Shi-Hau Fang, Jeih-weih Hung, Lin-Shan Lee
Although Mel-frequency cepstral coefficients (MFCC) have been proven to perform very well under most conditions, some limited efforts have been made in optimizing the shape of the filters in the filter-bank in the conventional MFCC approach. This paper presents a new feature extraction approach that designs the shapes of the filters in the filter-bank. In this new approach, the filter-bank coeffic...
#Mel frequency cepstral coefficient #Feature extraction #Speech recognition #Shape #Filters #Principal component analysis #Additive noise #Working environment noise #Noise shaping #Cepstral analysis
An improved union model for continuous speech recognition with partial duration corruption
- Pages 25-28
Ji Ming
The probabilistic union model is improved for continuous speech recognition involving partial duration corruption, assuming no knowledge about the corrupting noise. The new developments include: an n-best rescoring strategy for union based continuous speech recognition; a dynamic segmentation algorithm for reducing the number of corrupted segments in the union model; a combination of the union mod...
#Speech recognition #Acoustic noise #Noise reduction #Speech enhancement #Signal to noise ratio #Time varying systems #Redundancy #Computer science #Heuristic algorithms #System testing
Task-specific adaptation of speech recognition models
- Pages 433-436
A. Sankar, A. Kannan, B. Shahshahani, E. Jackson
Most published adaptation research focuses on speaker adaptation, and on adaptation for noisy channels and background environments. We study acoustic, grammar, and combined acoustic and grammar adaptation for creating task-specific recognition models. Comprehensive experimental results are presented using data from natural language quotes and a trading application. The results show that task adapt...
#Speech recognition #Hidden Markov models #Distributed computing #Loudspeakers #Acoustic applications #Adaptation model #Smoothing methods #Acoustic noise #Background noise #Working environment noise
Finite-state transducers for speech-input translation
- Pages 375-380
F. Casacuberta
Nowadays, hidden Markov models (HMMs) and n-grams are the basic components of the most successful speech recognition systems. In such systems, HMMs (the acoustic models) are integrated into an n-gram or a stochastic finite-state grammar (the language model). Similar models can be used for speech translation, and HMMs (the acoustic models) can be integrated into a finite-state transducer (the transl...
#Hidden Markov models #Acoustic transducers #Stochastic processes #Speech recognition #Stochastic systems #Telephony #Prototypes #Search engines #Natural languages #Decoding
Computing consensus translation from multiple machine translation systems
- Pages 351-354
B. Bangalore, G. Bordel, G. Riccardi
We address the problem of computing a consensus translation given the outputs from a set of machine translation (MT) systems. The translations from the MT systems are aligned with a multiple string alignment algorithm and the consensus translation is then computed. We describe the multiple string alignment algorithm and the consensus MT hypothesis computation. We report on the subjective and objec...
#Natural languages #Tagging #Text categorization #Performance evaluation #Optical wavelength conversion #Impedance matching #Robustness #Speech recognition #Stochastic processes #Automatic speech recognition
Pseudo 2-dimensional hidden Markov models in speech recognition
- Pages 441-444
S. Werner, G. Rigoll
In this paper, the usage of pseudo 2-dimensional hidden Markov models for speech recognition is discussed. This image processing method should better model the time-frequency structure in speech signals. The method calculates the emission probability of a standard HMM by embedded HMM for each state. If a temporal sequence of spectral vectors is imagined as a spectrogram, this leads to a 2-dimensio...
#Hidden Markov models #Speech recognition #Feature extraction #Image processing #Speech processing #Spectrogram #Computer science #Signal processing #Frequency #Databases
ETUDE, a recursive dialog manager with embedded user interface patterns
- Pages 244-247
R. Pieraccini, S. Caskey, K. Dayanidhi, B. Carpenter, M. Phillips
We describe ETUDE, a dialog manager that supports recursive descriptions of the dialog flow in spoken dialog applications. We also introduce the notion of user interface patterns, i.e. those dialog patterns that are frequently used in applications. We then describe how these patterns can be built into the dialog manager engine in order to facilitate the design and development of complex applicatio...
#User interfaces #Telephony #Control systems #Costs #Winches #Engines #Logic #Navigation #Usability #Design automation
A comparative study of model-based adaptation techniques for a compact speech recognizer
- Pages 29-32
F. Thiele, R. Bippus
Many techniques for speaker adaptation have been successfully applied to automatic speech recognition. This paper compares the performance of several adaptation methods with respect to their memory need and processing demand. For adaptation of a compact acoustic model with 4k densities, eigenvoices and structural MAP (SMAP) are investigated next to the well-known techniques of MAP (maximum a poste...
#Adaptation model #Speech recognition #Loudspeakers #Automatic speech recognition #Maximum likelihood linear regression #Laboratories #Error analysis #Command and control systems #Degradation #Regression tree analysis
Simultaneous recognition of distant talking speech of multiple sound sources based on 3-D N-best search algorithm
- Pages 111-114
P. Heracleous, S. Nakamura, K. Shikano
This paper deals with the simultaneous recognition of distant-talking speech of multiple talkers using the 3D N-best search algorithm. We describe the basic idea of the 3D N-best search and we address two additional techniques implemented into the baseline system. Namely, a path distance-based clustering and a likelihood normalization technique appeared to be necessary in order to build an efficie...
#Speech recognition #Hidden Markov models #Viterbi algorithm #Search methods #Natural languages #Clustering algorithms #Reverberation #Feature extraction #Adaptive systems #Sorting
ASR in portable wireless devices
- Pages 96-102
O. Viikki
This paper discusses the applicability and role of automatic speech recognition in portable wireless devices. Due to the author's background, the viewpoints are somewhat biased to mobile telephones, but many of the aspects are nevertheless common for other portable devices as well. While still dominated by the speaker-dependent technology, there are today signs that also in wireless devices, there...
#Automatic speech recognition #Costs #Embedded system #Acoustic noise #Noise robustness #Telephony #Mobile communication #Speech recognition #Acoustic devices #Adaptation model