European Language Resources Association history and recent developments - Pages 465-466
K. Choukri
This paper aims at briefly describing the rationale behind the foundation of the
European Language Resources Association (ELRA) in 1995 and its activities since
then. We would like to focus on the issues involved in making language resources
available to different sectors of the language engineering community. ELRA is
presented as a conduit for the distribution of speech, written and terminology
d...
#History #Speech #Terminology #Databases #Natural languages #Research and development #Law #Legal factors #Logistics #Investments
Acoustic analysis and recognition of whispered speech - Pages 429-432
T. Itoh, K. Takeda, F. Itakura
The acoustic properties and a recognition method of whispered speech are
discussed. A whispered speech database that consists of whispered speech, normal
speech and the corresponding facial video images of more than 6,000 sentences
from 100 speakers was prepared. The comparison between whispered and normal
utterances shows that: 1) the cepstrum distance between them is 4 dB for voiced
and 2 dB for ...
#Speech analysis #Speech recognition #Speech processing #Hidden Markov models #Image databases #Maximum likelihood linear regression #Video recording #Loudspeakers #Cepstrum #Frequency
n-gram and decision tree based language identification for written words - Pages 335-338
J. Hakkinen, Jilei Tian
As the demand for multilingual speech recognizers increases, the development of
systems which combine automatic language identification, language-specific
pronunciation modeling and language-independent acoustic models becomes
increasingly important. When the recognition grammar is dynamic and obtained
directly from written text, the language associated with each grammar item has
to be identified ...
#Decision trees #Natural languages #Speech recognition #Mobile handsets #Automatic speech recognition #Testing #Vocabulary #Usability #Signal processing #Embedded computing
Histogram based normalization in the acoustic feature space - Pages 21-24
S. Molau, M. Pitz, H. Ney
We describe a technique called histogram normalization that aims at normalizing
feature space distributions at different stages in the signal analysis
front-end, namely the log-compressed filterbank vectors, cepstrum coefficients,
and LDA (linear discriminant analysis) transformed acoustic vectors. Best results
are obtained at the filterbank, and in most cases there is a minor additional
gain when ...
#Histograms #Filter bank #Signal analysis #Cepstrum #Linear discriminant analysis #Target recognition #Smoothing methods #Training data #Speech recognition #Error analysis
Construction of model-space constraints - Pages 69-72
P. Nguyen, L. Rigazio, C. Wellekens, J.-C. Junqua
HMM systems exhibit a large amount of redundancy. To exploit it, a technique
called eigenvoices was found to be very effective for speaker adaptation. The
correlation between HMM parameters is exploited via a linear constraint called
eigenspace. This constraint is obtained through a PCA of the training speakers.
We show how PCA can be linked to the maximum-likelihood criterion. Then, we
extend the m...
#Maximum likelihood linear regression #Covariance matrix #Maximum likelihood estimation #Hidden Markov models #Principal component analysis #Speech #Linear discriminant analysis #Piecewise linear techniques #Gaussian processes #Vocabulary
Introduction of speech interface for mobile information services - Pages 462-463
H. Nakano
Popular Japanese mobile Web-phones are widely used to connect to Internet
providers (IP). The most popular service on mobile Web-phones is E-mail.
Currently, users type the messages using the ten standard keys on the phone.
Several letters and Kana (Japanese phonetic characters) are assigned to each
key, and the user steps through them by tapping the key repeatedly. After
inputting several words, ...
#Automatic speech recognition #Working environment noise #Background noise #Electronic mail #Speech recognition #Cellular phones #Switches #Laboratories #Web and internet services #Lapping
Language modeling for multi-domain speech-driven text retrieval - Pages 327-330
K. Itou, A. Fujii, T. Ishikawa
We report experimental results associated with speech-driven text retrieval,
which facilitates retrieving information in multiple domains with spoken
queries. Since users speak contents related to a target collection, we produce
language models used for speech recognition based on the target collection, so
as to improve both the recognition and retrieval accuracy. Experiments using
existing test c...
#Natural languages #Speech recognition #Information retrieval #Automatic speech recognition #Testing #Decoding #Libraries #Information science #Target recognition #Content based retrieval
Bridging the gap between mixed-initiative dialogs and reusable sub-dialogs - Pages 276-279
S. Kronenberg, P. Regel-Brietzman
To ease the development of dialog systems, it is desirable that
reusable dialog components provide pre-packaged functionality 'out-of-the-box'
that enables developers to quickly build applications by providing standard
default settings and behavior. Additionally, human-computer interaction should
become more human-like in that mixed-initiative dialogs are supported.
Mixed-initiative inter...
#Application software #Banking #Speech processing #Standards development #Expert systems #Vocabulary #Navigation
Recognition experiments with the SpeechDat-Car Aurora Spanish database using 8 kHz- and 16 kHz-sampled signals - Pages 135-138
C. Nadeu, M. Tolos
Like the other SpeechDat-Car databases, the Spanish one has been collected using
a 16 kHz sampling frequency, and several microphone positions and environmental
noises. We aim at clarifying whether there is any advantage in terms of
recognition performance from processing the 16 kHz-sampled signals instead of
the usual 8 kHz-sampled ones. Recognition tests have been carried out within the
Aurora e...
#Databases #Microphones #Frequency #Working environment noise #Testing #Sampling methods #Bandwidth #Speech recognition #Telecommunication standards #Standards development
Smoothed language model incorporation for efficient time-synchronous beam search decoding in LVCSR - Pages 178-181
D. Willett, E. McDermott, S. Katagiri
To perform the decoding search in large vocabulary continuous speech
recognition (LVCSR) with hidden Markov models (HMM) and statistical language
models, the most straightforward and popular approach is the time-synchronous
beam search procedure. A drawback of this approach is that the time-asynchrony
of the language model weight application during search leads to performance
degradations. Thi...
#Decoding #Hidden Markov models #Acoustic beams #Degradation #Smoothing methods #Viterbi algorithm #Context modeling #Laboratories #Speech recognition #Natural languages