Perceptual audio features for emotion detection
Abstract
Keywords
References
Cowie R, Douglas-Cowie E, Tsapatsoulis N, Votsis G, Kollias S, Fellenz W, Taylor J: Emotion recognition in human-computer interaction. IEEE Signal Process Mag 2001, 18(1):32-80.
El Ayadi M, Kamel MS, Karray F: Survey on speech emotion recognition: features, classification schemes, and databases. Pattern Recognit 2011, 44(3):572-587.
Lee CM, Narayanan SS: Toward detecting emotions in spoken dialogs. IEEE Trans Speech Audio Process 2005, 13: 293-303.
Gunes H, Schuller B, Pantic M, Cowie R: Emotion representation, analysis and synthesis in continuous space: a survey. Proc of the IEEE Int Workshop on EmoSPACE, in Conjunction with the IEEE FG 2011, CA, USA 2011, 827-834.
Schuller B, Vlasenko B, Eyben F, Rigoll G, Wendemuth A: Acoustic emotion recognition: a benchmark comparison of performances. Proc of the IEEE Automatic Speech Recognition and Understanding Workshop, Italy 2009, 552-557.
Ververidis D, Kotropoulos C: Emotional speech recognition: resources, features, and methods. Speech Commun 2006, 48(9):1162-1181.
Young S, Evermann G, Gales M, Hain T, Kershaw D, Liu X, Moore G, Odell J, Ollason D, Povey D, Valtchev V, Woodland P: The HTK Book (v3.4). Cambridge University Press, Cambridge; 2006.
Eyben F, Wöllmer M, Schuller B: openEAR: introducing the Munich open-source emotion and affect recognition toolkit. IEEE Proc of the 4th International HUMAINE Association Conference on Affective Computing and Intelligent Interaction, Amsterdam 2009, 576-581.
Stuhlsatz A, Meyer C, Eyben F, Zielke T, Meier G, Schuller B: Deep neural networks for acoustic emotion recognition: raising the benchmarks. Proc of the IEEE International Conference on Acoustics Speech and Signal Processing, Prague 2011, 5688-5691.
Lugger M, Yang B: Psychological motivated multi-stage emotion classification exploiting voice quality features. In Speech Recognition, Technologies and Applications. Edited by: Mihelic F, Zibert J. I-Tech Education and Publishing, Vienna, Austria; 2008:395-410.
Yang B, Lugger M: Emotion recognition from speech signals using new harmony features. Signal Process 2010, 90(5):1415-1423.
Sezgin C, Gunsel B, Kurt GK: A novel perceptual feature set for audio emotion recognition. Proc of the IEEE Int Workshop on EmoSPACE, in Conjunction with the IEEE FG 2011, CA, USA 2011, 780-785.
Schuller B, Vlasenko B, Eyben F, Wöllmer M, Stuhlsatz A, Wendemuth A, Rigoll G: Cross-corpus acoustic emotion recognition: variances and strategies. IEEE Trans Affect Comput 2010, 1(2):1-13.
Burkhardt F, Paeschke A, Rolfes M, Sendlmeier W, Weiss B: A database of German emotional speech. Proc of the INTERSPEECH, Portugal 2005, 1517-1520.
Grimm M, Kroschel K, Narayanan S: The Vera am Mittag German audio-visual emotional speech database. Proc of the IEEE International Conference on Multimedia and Expo, Germany 2008, 737-742.
Nwe T, Foo S, De Silva L: Speech emotion recognition using hidden Markov models. Speech Commun 2003, 41: 603-623.
Zhou G, Hansen JHL, Kaiser JF: Nonlinear feature based classification of speech under stress. IEEE Trans Speech Audio Process 2001, 9(3):201-216.
Chen L, Huang T, Miyasato T, Nakatsu R: Multimodal human emotion/expression recognition. Proc of the IEEE Automatic Face and Gesture Recognition, Japan 1998, 366-371.
Pudil P, Ferri F, Novovicova J, Kittler J: Floating search method for feature selection with nonmonotonic criterion functions. Proc of the International Conference on Pattern Recognition, Israel 1994, 279-283.
International Telecommunications Union Recommendation BS.1387-1, Method for objective measurements of perceived audio quality 2000.
Thiede T, Treurniet WC, Bitto R, Schmidmer C, Sporer T, Beerends JG, Colomes C, Keyhl M, Stoll H, Brandenburg K: PEAQ: the ITU standard for objective measurement of perceived audio quality. J Audio Eng Soc 2000, 48: 3-29.
Busso C, Lee S, Narayanan S: Analysis of emotionally salient aspects of fundamental frequency for emotion detection. IEEE Trans Audio Speech Lang Process 2009, 17(4):582-596.
Murphy PJ, McGuigan KG, Walsh M, Colreavy M: Investigation of a glottal related harmonics-to-noise ratio and spectral tilt as indicators of glottal noise in synthesized and human voice signals. J Acoust Soc Am 2008, 123(3):1642-1652.
Chang CC, Lin CJ: LIBSVM: a library for support vector machines. ACM Trans Intell Syst Technol 2011, 2: 27:1-27:27.
Witten IH, Frank E: Data Mining: Practical Machine Learning Tools and Techniques With Java Implementations. Morgan Kaufmann, San Francisco; 2000.