IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01.


Governing body: N/A

Featured papers

Improved MFCC feature extraction by PCA-optimized filter-bank for speech recognition
- Pages 49-52
Shang-Ming Lee, Shi-Hau Fang, Jeih-weih Hung, Lin-Shan Lee
Although Mel-frequency cepstral coefficients (MFCC) have been proven to perform very well under most conditions, some limited efforts have been made in optimizing the shape of the filters in the filter-bank in the conventional MFCC approach. This paper presents a new feature extraction approach that designs the shapes of the filters in the filter-bank. In this new approach, the filter-bank coeffic...
#Mel frequency cepstral coefficient #Feature extraction #Speech recognition #Shape #Filters #Principal component analysis #Additive noise #Working environment noise #Noise shaping #Cepstral analysis
An examination of three classes of ASR dialogue systems: PC-based dictation, in-car systems and automated directory assistance
- Pages 455-461
M.J. Hunt
Three classes of practical speech recognition dialogue systems are considered, starting with PC-based systems, specifically dictation systems. Although such systems have become very effective, they have not achieved mainstream use. Some reasons for this disappointing outcome are proposed. Speech recognition is now appearing in production cars. It is argued that the two most attractive in-car appli...
#Automatic speech recognition #Speech recognition #Marketing and sales #Navigation #Telephony #Databases #Business #Application software #Computerized monitoring #Automatic control
Searching for the missing piece [speech recognition]
- Pages 230-233
W.N. Choi, Y.W. Wong, T. Lee, P.C. Ching
The tree-trellis forward-backward algorithm has been widely used for N-best searching in continuous speech recognition. In conventional approaches, the heuristic score used for the A* backward search is derived from the partial-path scores recorded during the forward pass. The inherently delayed use of a language model in the lexical tree structure leads to inefficient pruning and the partial-path...
#Lattices #Delay estimation #Speech #Tree data structures #Viterbi algorithm
VoiceXML 2.0 and the W3C speech interface framework
- Pages 5-8
J.A. Larson
The World Wide Web Voice Browser Working Group has released specifications for four integrated languages for developing speech applications: VoiceXML 2.0, Speech Synthesis Markup Language, Speech Recognition Grammar Markup Language, and Semantic Interpretation. These languages enable developers to quickly specify conversational speech Web applications that can be accessed by any telephone or cell p...
#Speech synthesis #Natural languages #Speech recognition #Engines #Markup languages #Telephony #Cellular phones #Web sites #Application software #Home appliances
The statistical approach to spoken language translation
- Pages 367-374
H. Ney
This paper gives an overview of our work on statistical machine translation of spoken dialogues, in particular in the framework of the VERBMOBIL project. The goal of the VERBMOBIL project is the translation of spoken dialogues in the domains of appointment scheduling and travel planning. Starting with the Bayes decision rule as in speech recognition, we show how the required probability distributi...
#Natural languages #Hidden Markov models #Probability distribution #Search problems #Performance loss
State synchronous modeling of audio-visual information for bi-modal speech recognition
- Pages 409-412
S. Nakamura, K. Kumatani, S. Tamura
There has been a higher demand recently for automatic speech recognition (ASR) systems able to operate robustly in acoustically noisy environments. This paper proposes a method to integrate audio and visual information effectively in audio-visual (bi-modal) ASR systems. Such integration inevitably necessitates modeling of the synchronization of the audio and visual information. To address the time...
#Speech recognition #Hidden Markov models #Automatic speech recognition #Working environment noise #Streaming media #Degradation #Spatial databases #Visual databases #Audio databases #Feature extraction
Some experiments on the use of one-channel noise reduction techniques with the Italian SpeechDat Car database
- Pages 139-142
M. Matassoni, G.A. Mian, M. Omologo, A. Santarelli, P. Svaizer
The use of noise reduction techniques for hands-free speech recognition in a car environment is investigated. A set of experiments was carried out using different speech enhancement algorithms based on noise estimation. In particular, linear spectral subtraction and MMSE estimators are considered with various parameter settings. Experiments were conducted on connected and isolated digits, extracte...
#Noise reduction #Speech enhancement #Working environment noise #Speech recognition #Low-frequency noise #Databases #Background noise #Road safety #Additive noise #Noise robustness
Speech interfaces for mobile communications
- Pages 93-95
H. Nakano
This paper explains speech interfaces for mobile communication. Mobile interfaces have three important design rules: do not disturb the user's main task, work within the restrictions of the user's ability, and minimize the resource requirements. Social acceptance is also important. In Japan, trial and regular services with speech interfaces in mobile environments have already been launched, but they a...
#Mobile communication #Cellular phones #Displays #Postal services #Weather forecasting #Portals #Privacy #Working environment noise #Automatic speech recognition #Voice mail
Grammar learning for spoken language understanding
- Pages 292-295
Ye-Yi Wang, A. Acero
Many state-of-the-art conversational systems use semantic-based robust understanding and manually derived grammars, a very time-consuming and error-prone process. This paper describes a machine-aided grammar authoring system that enables a programmer to rapidly develop a high-quality grammar for conversational systems. This is achieved with a combination of domain-specific semantics, a library gra...
#Natural languages #Robustness #Authoring systems #Programming profession #Libraries #Information systems #Computer errors #Writing #Law #Legal factors
Author index
- Pages 467-468 - 2001
The author index contains an entry for each author and coauthor included in the proceedings record.