Classifying emotions in human-machine spoken dialogs

Chul Min Lee1, S.S. Narayanan1, R. Pieraccini2
1Department of Electrical Engineering and IMSC, University of Southern California, Los Angeles, CA, USA
2Speechworks International, Inc., NY, USA

Abstract

This paper compares several acoustic feature sets and classification algorithms for classifying spoken utterances according to the emotional state of the speaker. The data come from a corpus of human-machine dialogs collected from a commercial application. Emotion recognition is posed as a pattern recognition problem. We used three techniques, a linear discriminant classifier (LDC), a k-nearest neighbor (k-NN) classifier, and a support vector machine classifier (SVC), to classify utterances into two emotion classes: negative and non-negative. Two feature sets were used: a base feature set obtained from utterance-level statistics of the pitch and energy of the speech, and a feature set derived from the base set by principal component analysis (PCA). The PCA feature set performed comparably to the base feature set. Overall, the LDC achieved the best performance with the base feature set, with error rates of 27.54% on female data and 25.46% on male data. The SVC, however, performed better when the data were sparse.
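The pipeline described in the abstract can be illustrated with a minimal sketch: summarize each utterance's pitch and energy contours with utterance-level statistics, then classify the resulting vector as negative or non-negative. Everything specific here is an assumption made for illustration, not taken from the paper: the particular statistics (mean, median, standard deviation, maximum, minimum, range), the toy contour values, the function names, and the choice of an unweighted k-NN vote with Euclidean distance. The abstract does not list the exact features or classifier settings.

```python
import math
import statistics
from collections import Counter


def utterance_stats(contour):
    """Summarize one frame-level contour (pitch or energy) for a whole utterance.

    The six statistics below are an assumed, illustrative choice.
    """
    return [
        statistics.mean(contour),
        statistics.median(contour),
        statistics.pstdev(contour),
        max(contour),
        min(contour),
        max(contour) - min(contour),
    ]


def feature_vector(pitch, energy):
    """Concatenate pitch and energy statistics into one 12-dimensional vector."""
    return utterance_stats(pitch) + utterance_stats(energy)


def knn_predict(train, query, k=3):
    """Majority vote among the k training vectors closest to the query.

    train is a list of (feature_vector, label) pairs; distance is Euclidean.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy contours: pitch in Hz, energy in arbitrary units.
train = [
    (feature_vector([210, 260, 320, 280], [0.6, 0.9, 1.0, 0.8]), "negative"),
    (feature_vector([220, 270, 310, 290], [0.7, 0.8, 1.0, 0.9]), "negative"),
    (feature_vector([180, 185, 190, 182], [0.4, 0.5, 0.5, 0.4]), "non-negative"),
    (feature_vector([175, 180, 186, 179], [0.3, 0.4, 0.5, 0.4]), "non-negative"),
]

query = feature_vector([215, 265, 315, 285], [0.6, 0.9, 1.0, 0.9])
print(knn_predict(train, query, k=3))
```

In this sketch the pitch values dominate the distance because the features are not rescaled; a practical system would normally normalize each feature before computing distances, and could substitute a linear discriminant or support vector classifier for the k-NN vote.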

Keywords

#Man machine systems #Principal component analysis #Static VAr compensators #Speech analysis #Classification algorithms #Loudspeakers #Emotion recognition #Pattern recognition #Linear discriminant analysis #Support vector machines
