Shang-Ming Lee, Shi-Hau Fang, Jeih-weih Hung, Lin-Shan Lee
Although Mel-frequency cepstral coefficients (MFCC) have been proven to perform very well under most conditions, some limited efforts have been made in optimizing the shape of the filters in the filter-bank in the conventional MFCC approach. This paper presents a new feature extraction approach that designs the shapes of the filters in the filter-bank. In this new approach, the filter-bank coeffic......
Three classes of practical speech recognition dialogue systems are considered, starting with PC-based systems, specifically dictation systems. Although such systems have become very effective, they have not achieved mainstream use. Some reasons for this disappointing outcome are proposed. Speech recognition is now appearing in production cars. It is argued that the two most attractive in-car appli......
#Automatic speech recognition #Speech recognition #Marketing and sales #Navigation #Telephony #Databases #Business #Application software #Computerized monitoring #Automatic control
The tree-trellis forward-backward algorithm has been widely used for N-best searching in continuous speech recognition. In conventional approaches, the heuristic score used for the A* backward search is derived from the partial-path scores recorded during the forward pass. The inherently delayed use of a language model in the lexical tree structure leads to inefficient pruning and the partial-path......
#Lattices #Delay estimation #Speech #Tree data structures #Viterbi algorithm
The World Wide Web Voice Browser Working Group has released specifications for four integrated languages for developing speech applications: VoiceXML 2.0, Speech Synthesis Markup Language, Speech Recognition Grammar Markup Language, and Semantic Interpretation. These languages enable developers to quickly specify conversational speech Web applications that can be accessed by any telephone or cell p......
#Speech synthesis #Natural languages #Speech recognition #Engines #Markup languages #Telephony #Cellular phones #Web sites #Application software #Home appliances
This paper gives an overview of our work on statistical machine translation of spoken dialogues, in particular in the framework of the VERBMOBIL project. The goal of the VERBMOBIL project is the translation of spoken dialogues in the domains of appointment scheduling and travel planning. Starting with the Bayes decision rule as in speech recognition, we show how the required probability distributi......
#Natural languages #Hidden Markov models #Probability distribution #Search problems #Performance loss
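For reference, the Bayes decision rule mentioned in the abstract above is commonly written as follows in statistical machine translation. The symbols are assumptions made here for illustration (f for the source-language sentence, e for a candidate target-language sentence) and are not taken from the truncated abstract:

```latex
\hat{e} \;=\; \operatorname*{argmax}_{e} \Pr(e \mid f)
        \;=\; \operatorname*{argmax}_{e} \Pr(e)\,\Pr(f \mid e)
```

Here Pr(e) plays the role of the language model and Pr(f | e) the role of the translation model, mirroring the language-model and acoustic-model split in speech recognition.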
There has been a higher demand recently for automatic speech recognition (ASR) systems able to operate robustly in acoustically noisy environments. This paper proposes a method to integrate audio and visual information effectively in audio-visual (bi-modal) ASR systems. Such integration inevitably necessitates modeling of the synchronization of the audio and visual information. To address the time......
M. Matassoni, G.A. Mian, M. Omologo, A. Santarelli, P. Svaizer
The use of noise reduction techniques for hands-free speech recognition in a car environment is investigated. A set of experiments was carried out using different speech enhancement algorithms based on noise estimation. In particular, linear spectral subtraction and MMSE estimators are considered with various parameter settings. Experiments were conducted on connected and isolated digits, extracte......
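As a rough illustration of the linear spectral subtraction named in the abstract above, the following is a minimal sketch for a single frame of magnitude spectra. It is not the authors' implementation: the function name `spectral_subtract`, the over-subtraction factor `alpha`, and the spectral floor `floor` are hypothetical choices made for this example, and the noise estimate is assumed to be given.

```python
def spectral_subtract(noisy_mag, noise_mag, alpha=1.0, floor=0.01):
    """Linear (magnitude-domain) spectral subtraction for one frame.

    noisy_mag: magnitude spectrum of the noisy frame (one value per bin)
    noise_mag: estimated noise magnitude spectrum (same length)
    alpha:     over-subtraction factor (assumed parameter)
    floor:     fraction of the noisy magnitude kept as a lower bound,
               so that no bin becomes negative (assumed parameter)
    """
    if len(noisy_mag) != len(noise_mag):
        raise ValueError("spectra must have the same number of bins")
    cleaned = []
    for y, n in zip(noisy_mag, noise_mag):
        diff = y - alpha * n                 # subtract the scaled noise estimate
        cleaned.append(max(diff, floor * y)) # clamp to the spectral floor
    return cleaned


if __name__ == "__main__":
    frame = [1.0, 0.5, 0.2]
    noise = [0.2, 0.2, 0.3]
    print(spectral_subtract(frame, noise))
```

In the third bin the noise estimate exceeds the noisy magnitude, so the result falls back to the floor value rather than going negative.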
This paper explains speech interfaces for mobile communication. Mobile interfaces have three important design rules: do not disturb the user's main task, work within the restrictions of the user's ability, and minimize the resource requirements. Social acceptance is also important. In Japan, trial and regular services with speech interfaces in mobile environments have already been launched, but they a......
#Mobile communication #Cellular phones #Displays #Postal services #Weather forecasting #Portals #Privacy #Working environment noise #Automatic speech recognition #Voice mail
Many state-of-the-art conversational systems use semantic-based robust understanding and manually derived grammars, a very time-consuming and error-prone process. This paper describes a machine-aided grammar authoring system that enables a programmer to rapidly develop a high-quality grammar for conversational systems. This is achieved with a combination of domain-specific semantics, a library gra......
#Natural languages #Robustness #Authoring systems #Programming profession #Libraries #Information systems #Computer errors #Writing #Law #Legal factors