Shape-based modeling of the fundamental frequency contour for emotion detection in speech

Computer Speech & Language - Volume 28 - Pages 278-294 - 2014
Juan Pablo Arias¹, Carlos Busso², Nestor Becerra Yoma¹
¹ Speech Processing and Transmission Laboratory, Department of Electrical Engineering, Universidad de Chile, Santiago, Chile
² Multimodal Signal Processing Laboratory, The University of Texas at Dallas, Richardson, TX 75080, USA

References

Arias, 2010, Automatic intonation assessment for computer aided language learning, Speech Communication, 52, 254, 10.1016/j.specom.2009.11.001

Batliner, 2010, Segmenting into adequate units for automatic recognition of emotion-related episodes: a speech-based approach, Advances in Human-Computer Interaction, 2010, 1, 10.1155/2010/782802

Boersma, 1996

Busso, 2009, Analysis of emotionally salient aspects of fundamental frequency for emotion detection, IEEE Transactions on Audio, Speech and Language Processing, 17, 582, 10.1109/TASL.2008.2009578

Cowie, 2001, Emotion recognition in human-computer interaction, IEEE Signal Processing Magazine, 18, 32, 10.1109/79.911197

El Ayadi, 2011, Survey on speech emotion recognition: Features, classification schemes, and databases, Pattern Recognition, 44, 572, 10.1016/j.patcog.2010.09.020

Eyben, 2009, openEAR: introducing the Munich open-source emotion and affect recognition toolkit, 576

Grimm, 2007, Primitives-based evaluation and estimation of emotions in speech, Speech Communication, 49, 787, 10.1016/j.specom.2007.01.010

Gubian, 2009, Functional data analysis as a tool for analyzing speech dynamics: a case study on the French word c’était, 2199

Koolagudi, 2012, Emotion recognition from speech: a review, International Journal of Speech Technology, 15, 99, 10.1007/s10772-011-9125-1

Langenecker, 2005, Face emotion perception and executive functioning deficits in depression, Journal of Clinical and Experimental Neuropsychology, 27, 320, 10.1080/13803390490490515720

Lieberman, 1962, Some aspects of fundamental frequency and envelope amplitude as related to the emotional content of speech, Journal of the Acoustical Society of America, 34, 922, 10.1121/1.1918222

Nicolaou, 2011, Continuous prediction of spontaneous affect from multiple cues and modalities in valence-arousal space, IEEE Transactions on Affective Computing, 2, 92, 10.1109/T-AFFC.2011.9

Paeschke, 2004, Global trend of fundamental frequency in emotional speech, 671

Paeschke, 2000, Prosodic characteristics of emotional speech: Measurements of fundamental frequency movements, 75

Patterson, 1999, Pitch range modelling: Linguistic dimensions of variation, 1169

Paul, 1992, The design for the Wall Street Journal-based CSR corpus, 899

Picard, 1997

Ramsay, 2005

Russell, 1987, Relativity in the perception of emotion in facial expressions, Journal of Experimental Psychology: General, 116, 223, 10.1037/0096-3445.116.3.223

Schuller, 2010, The INTERSPEECH 2010 paralinguistic challenge, 2794

Schuller, 2011, AVEC 2011: the first international audio/visual emotion challenge, 415

Taylor, 2000, Analysis and synthesis of intonation using the tilt model, Journal of the Acoustical Society of America, 107, 1697, 10.1121/1.428453

Wang, 2005, F0 contour of prosodic word in happy speech of Mandarin, 433

Zeng, 2009, A survey of affect recognition methods: audio, visual, and spontaneous expressions, IEEE Transactions on Pattern Analysis and Machine Intelligence, 31, 39, 10.1109/TPAMI.2008.52