An improved i-vector extraction algorithm for speaker verification
EURASIP Journal on Audio, Speech, and Music Processing - Vol. 2015 - pp. 1-9 - 2015
Wei Li, Tianfan Fu, Jie Zhu
Over recent years, the i-vector-based framework has been proven to provide
state-of-the-art performance in speaker verification. Each utterance is
projected onto a total factor space and is represented by a low-dimensional
feature vector. Channel compensation techniques are carried out in this
low-dimensional feature space. Most of the compensation techniques take the sets
of extracted i-vectors as in...
Feature compensation based on the normalization of vocal tract length for the improvement of emotion-affected speech recognition
EURASIP Journal on Audio, Speech, and Music Processing - Vol. 2021 - pp. 1-19 - 2021
Masoud Geravanchizadeh, Elnaz Forouhandeh, Meysam Bashirpour
The performance of speech recognition systems trained with neutral utterances
degrades significantly when these systems are tested with emotional speech.
Since everybody can speak emotionally in the real-world environment, it is
necessary to take account of the emotional states of speech in the performance
of the automatic speech recognition system. Limited work has been done in
the field o...
On the Utility of Syllable-Based Acoustic Models for Pronunciation Variation Modelling
EURASIP Journal on Audio, Speech, and Music Processing - 2007
Annika Hämäläinen, Lou Boves, Johan de Veth, Louis ten Bosch
Recent research on the TIMIT corpus suggests that longer-length acoustic models
are more appropriate for pronunciation variation modelling than the
context-dependent phones that conventional automatic speech recognisers use.
However, the impressive speech recognition results obtained with longer-length
models on TIMIT remain to be reproduced on other corpora. To understand the
conditions in which ...
An overview of machine learning and other data-based methods for spatial audio capture, processing, and reproduction
EURASIP Journal on Audio, Speech, and Music Processing - Vol. 2022 - pp. 1-21 - 2022
Maximo Cobos, Jens Ahrens, Konrad Kowalczyk, Archontis Politis
The domain of spatial audio comprises methods for capturing, processing, and
reproducing audio content that contains spatial information. Data-based methods
are those that operate directly on the spatial information carried by audio
signals. This is in contrast to model-based methods, which impose spatial
information from, for example, metadata like the intended position of a source
onto signals t...
Articulation constrained learning with application to speech emotion recognition
EURASIP Journal on Audio, Speech, and Music Processing - Vol. 2019 - pp. 1-17 - 2019
Mohit Shah, Ming Tu, Visar Berisha, Chaitali Chakrabarti, Andreas Spanias
Speech emotion recognition methods combining articulatory information with
acoustic features have been previously shown to improve recognition performance.
Collection of articulatory data on a large scale may not be feasible in many
scenarios, thus restricting the scope and applicability of such methods. In this
paper, a discriminative learning method for emotion recognition using both
articulator...
Adaptive V/UV Speech Detection Based on Characterization of Background Noise
EURASIP Journal on Audio, Speech, and Music Processing - Vol. 2009 - pp. 1-12 - 2009
F Beritelli, S Casale, A Russo, S Serrano
The paper presents an adaptive system for Voiced/Unvoiced (V/UV) speech
detection in the presence of background noise. Genetic algorithms were used to
select the features that offer the best V/UV detection according to the output
of a background Noise Classifier (NC) and a Signal-to-Noise Ratio Estimation
(SNRE) system. The system was implemented, and the tests performed using the
TIMIT speech cor...
Robust image-in-audio watermarking technique based on DCT-SVD transform
EURASIP Journal on Audio, Speech, and Music Processing - Vol. 2018 - pp. 1-12 - 2018
Aniruddha Kanhe, Aghila Gnanasekaran
In this paper, a robust and highly imperceptible audio watermarking technique is
presented based on discrete cosine transform (DCT) and singular value
decomposition (SVD). The low-frequency components of the audio signal have been
selectively embedded with watermark image data making the watermarked audio
highly imperceptible and robust. The imperceptibility of the proposed methods is
evaluated by com...
Depression-level assessment from multi-lingual conversational speech data using acoustic and text features
EURASIP Journal on Audio, Speech, and Music Processing - Vol. 2020 - pp. 1-17 - 2020
Cenk Demiroglu, Aslı Beşirli, Yasin Ozkanca, Selime Çelik
Depression is a widespread mental health problem around the world with a
significant burden on economies. Its early diagnosis and treatment are critical
to reduce the costs and even save lives. One key aspect to achieve that goal is
to use technology and monitor depression remotely and relatively inexpensively
using automated agents. There have been numerous efforts to automatically assess
depressi...
Beyond the Big Five personality traits for music recommendation systems
EURASIP Journal on Audio, Speech, and Music Processing - Vol. 2023 - pp. 1-17 - 2023
Mariusz Kleć, Alicja Wieczorkowska, Krzysztof Szklanny, Włodzimierz Strus
The aim of this paper is to investigate the influence of personality traits,
characterized by the BFI (Big Five Inventory) and its significant revision
called BFI-2, on music recommendation error. The BFI-2 describes the lower-order
facets of the Big Five personality traits. We performed experiments with 279
participants, using an application (called Music Master) we developed for music
listening ...
Speech emotion recognition based on emotion perception
EURASIP Journal on Audio, Speech, and Music Processing - Vol. 2023 - pp. 1-7 - 2023
Gang Liu, Shifang Cai, Ce Wang
Speech emotion recognition (SER) is a hot topic in speech signal processing.
With the development of cheap computing power and the proliferation of
research in data-driven methods, deep learning approaches are prominent
solutions to SER nowadays. SER is a challenging task due to the scarcity of
datasets and the lack of emotion perception. Most existing networks of SER are
based on compute...