EURASIP Journal on Audio, Speech, and Music Processing

Featured scientific publications

* Data is for reference only

An improved i-vector extraction algorithm for speaker verification
EURASIP Journal on Audio, Speech, and Music Processing - Volume 2015 - Pages 1-9 - 2015
Wei Li, Tianfan Fu, Jie Zhu
Over recent years, i-vector-based framework has been proven to provide state-of-the-art performance in speaker verification. Each utterance is projected onto a total factor space and is represented by a low-dimensional feature vector. Channel compensation techniques are carried out in this low-dimensional feature space. Most of the compensation techniques take the sets of extracted i-vectors as in...
Feature compensation based on the normalization of vocal tract length for the improvement of emotion-affected speech recognition
EURASIP Journal on Audio, Speech, and Music Processing - Volume 2021 - Pages 1-19 - 2021
Masoud Geravanchizadeh, Elnaz Forouhandeh, Meysam Bashirpour
The performance of speech recognition systems trained with neutral utterances degrades significantly when these systems are tested with emotional speech. Since everybody can speak emotionally in the real-world environment, it is necessary to take account of the emotional states of speech in the performance of the automatic speech recognition system. Limited works have been performed in the field o...
On the Utility of Syllable-Based Acoustic Models for Pronunciation Variation Modelling
EURASIP Journal on Audio, Speech, and Music Processing - - 2007
Annika Hämäläinen, Lou Boves, Johan de Veth, Louis ten Bosch
Recent research on the TIMIT corpus suggests that longer-length acoustic models are more appropriate for pronunciation variation modelling than the context-dependent phones that conventional automatic speech recognisers use. However, the impressive speech recognition results obtained with longer-length models on TIMIT remain to be reproduced on other corpora. To understand the conditions in which ...
An overview of machine learning and other data-based methods for spatial audio capture, processing, and reproduction
EURASIP Journal on Audio, Speech, and Music Processing - Volume 2022 - Pages 1-21 - 2022
Maximo Cobos, Jens Ahrens, Konrad Kowalczyk, Archontis Politis
The domain of spatial audio comprises methods for capturing, processing, and reproducing audio content that contains spatial information. Data-based methods are those that operate directly on the spatial information carried by audio signals. This is in contrast to model-based methods, which impose spatial information from, for example, metadata like the intended position of a source onto signals t...
Articulation constrained learning with application to speech emotion recognition
EURASIP Journal on Audio, Speech, and Music Processing - Volume 2019 - Pages 1-17 - 2019
Mohit Shah, Ming Tu, Visar Berisha, Chaitali Chakrabarti, Andreas Spanias
Speech emotion recognition methods combining articulatory information with acoustic features have been previously shown to improve recognition performance. Collection of articulatory data on a large scale may not be feasible in many scenarios, thus restricting the scope and applicability of such methods. In this paper, a discriminative learning method for emotion recognition using both articulator...
Adaptive V/UV Speech Detection Based on Characterization of Background Noise
EURASIP Journal on Audio, Speech, and Music Processing - Volume 2009 - Pages 1-12 - 2009
F Beritelli, S Casale, A Russo, S Serrano
The paper presents an adaptive system for Voiced/Unvoiced (V/UV) speech detection in the presence of background noise. Genetic algorithms were used to select the features that offer the best V/UV detection according to the output of a background Noise Classifier (NC) and a Signal-to-Noise Ratio Estimation (SNRE) system. The system was implemented, and the tests performed using the TIMIT speech cor...
Robust image-in-audio watermarking technique based on DCT-SVD transform
EURASIP Journal on Audio, Speech, and Music Processing - Volume 2018 - Pages 1-12 - 2018
Aniruddha Kanhe, Aghila Gnanasekaran
In this paper, a robust and highly imperceptible audio watermarking technique is presented based on discrete cosine transform (DCT) and singular value decomposition (SVD). The low-frequency components of the audio signal have been selectively embedded with watermark image data making the watermarked audio highly imperceptible and robust. The imperceptibility of proposed methods is evaluated by com...
Depression-level assessment from multi-lingual conversational speech data using acoustic and text features
EURASIP Journal on Audio, Speech, and Music Processing - Volume 2020 - Pages 1-17 - 2020
Cenk Demiroglu, Aslı Beşirli, Yasin Ozkanca, Selime Çelik
Depression is a widespread mental health problem around the world with a significant burden on economies. Its early diagnosis and treatment are critical to reduce the costs and even save lives. One key aspect to achieve that goal is to use technology and monitor depression remotely and relatively inexpensively using automated agents. There has been numerous efforts to automatically assess depressi...
Beyond the Big Five personality traits for music recommendation systems
EURASIP Journal on Audio, Speech, and Music Processing - Volume 2023 - Pages 1-17 - 2023
Mariusz Kleć, Alicja Wieczorkowska, Krzysztof Szklanny, Włodzimierz Strus
The aim of this paper is to investigate the influence of personality traits, characterized by the BFI (Big Five Inventory) and its significant revision called BFI-2, on music recommendation error. The BFI-2 describes the lower-order facets of the Big Five personality traits. We performed experiments with 279 participants, using an application (called Music Master) we developed for music listening ...
Speech emotion recognition based on emotion perception
EURASIP Journal on Audio, Speech, and Music Processing - Volume 2023 - Pages 1-7 - 2023
Gang Liu, Shifang Cai, Ce Wang
Speech emotion recognition (SER) is a hot topic in speech signal processing. With the advanced development of the cheap computing power and proliferation of research in data-driven methods, deep learning approaches are prominent solutions to SER nowadays. SER is a challenging task due to the scarcity of datasets and the lack of emotion perception. Most existing networks of SER are based on compute...
Total: 331