Survey on speech emotion recognition: Features, classification schemes, and databases

Pattern Recognition, Vol. 44, No. 3, pp. 572–587, 2011
Moataz El Ayadi1, Mohamed S. Kamel2, Fakhri Karray2
1Engineering Mathematics and Physics, Cairo University, Giza 12613, Egypt
2Electrical and Computer Engineering, University of Waterloo, 200 University Avenue W., Waterloo, Ontario, Canada N2L 1V9

Abstract

Keywords


References

Akaike, 1974, A new look at the statistical model identification, IEEE Trans. Autom. Control, 19, 716, 10.1109/TAC.1974.1100705

N. Amir, S. Ron, N. Laor, Analysis of an emotional speech corpus in Hebrew based on objective criteria, in: SpeechEmotion-2000, 2000, pp. 29–33.

J. Ang, R. Dhillon, A. Krupski, E. Shriberg, A. Stolcke, Prosody-based automatic detection of annoyance and frustration in human–computer dialog, in: Proceedings of the ICSLP 2002, 2002, pp. 2037–2040.

Atal, 1974, Effectiveness of linear prediction characteristics of the speech wave for automatic speaker identification and verification, J. Acoust. Soc. Am., 55, 1304, 10.1121/1.1914702

Athanaselis, 2005, ASR for emotional speech: clarifying the issues and enhancing the performance, Neural Networks, 18, 437, 10.1016/j.neunet.2005.03.008

M.M.H. El Ayadi, M.S. Kamel, F. Karray, Speech emotion recognition using Gaussian mixture vector autoregressive models, in: ICASSP 2007, vol. 4, 2007, pp. 957–960.

Banse, 1996, Acoustic profiles in vocal emotion expression, J. Pers. Soc. Psychol., 70, 614, 10.1037/0022-3514.70.3.614

A. Batliner, K. Fischer, R. Huber, J. Spiker, E. Noth, Desperately seeking emotions: actors, wizards and human beings, in: Proceedings of the ISCA Workshop Speech Emotion, 2000, pp. 195–200.

Beeke, 2009, Prosody as a compensatory strategy in the conversations of people with agrammatism, Clin. Linguist. Phonetics, 23, 133, 10.1080/02699200802602985

Bishop, 1995

M. Borchert, A. Dusterhoft, Emotions in speech—experiments with prosody and quality features in speech for use in categorical and dimensional emotion recognition environments, in: Proceedings of 2005 IEEE International Conference on Natural Language Processing and Knowledge Engineering, IEEE NLP-KE’05 2005, 2005, pp. 147–151.

Bosch, 2003, Emotions, speech and the ASR framework, Speech Commun., 40, 213, 10.1016/S0167-6393(02)00083-3

Bou-Ghazale, 2000, A comparative study of traditional and newly proposed features for recognition of speech under stress, IEEE Trans. Speech Audio Process., 8, 429, 10.1109/89.848224

Le Bouquin, 1996, Enhancement of noisy speech signals: application to mobile radio communications, Speech Commun., 18, 3, 10.1016/0167-6393(95)00021-6

Breazeal, 2002, Recognition of affective communicative intent in robot-directed speech, Autonomous Robots, 12, 83, 10.1023/A:1013215010749

Breiman, 1996, Bagging predictors, Mach. Learn., 24, 123, 10.1007/BF00058655

Burges, 1998, A tutorial on support vector machines for pattern recognition, Data Mining Knowl. Discovery, 2, 121, 10.1023/A:1009715923555

F. Burkhardt, A. Paeschke, M. Rolfes, W. Sendlmeier, B. Weiss, A database of German emotional speech, in: Proceedings of the Interspeech 2005, Lissabon, Portugal, 2005, pp. 1517–1520.

Busso, 2009, Analysis of emotionally salient aspects of fundamental frequency for emotion detection, IEEE Trans. Audio Speech Language Process., 17, 582, 10.1109/TASL.2008.2009578

Cahn, 1990, The generation of affect in synthesized speech, J. Am. Voice Input/Output Soc., 8, 1

Cairns, 1994, Nonlinear analysis and detection of speech under stressed conditions, J. Acoust. Soc. Am., 96, 3392, 10.1121/1.410601

W. Campbell, Databases of emotional speech, in: Proceedings of the ISCA (International Speech Communication Association) ITRW on Speech and Emotion, 2000, pp. 34–38.

C. Chen, M. You, M. Song, J. Bu, J. Liu, An enhanced speech emotion recognition system based on discourse information, in: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 3991, 2006, pp. 449–456.

L. Chen, T. Huang, T. Miyasato, R. Nakatsu, Multimodal human emotion/expression recognition, in: Proceedings of the IEEE Automatic Face and Gesture Recognition, 1998, pp. 366–371.

Z. Chuang, C. Wu, Emotion recognition using acoustic features and textual content, Multimedia and Expo, 2004. IEEE International Conference on ICME ’04, vol. 1, 2004, pp. 53–56.

R. Cohen, A computational theory of the function of clue words in argument understanding, in: ACL-22: Proceedings of the 10th International Conference on Computational Linguistics and 22nd Annual Meeting on Association for Computational Linguistics, 1984, pp. 251–258.

Cowie, 2003, Describing the emotional states that are expressed in speech, Speech Commun., 40, 5, 10.1016/S0167-6393(02)00071-7

R. Cowie, E. Douglas-Cowie, Automatic statistical analysis of the signal and prosodic signs of emotion in speech, in: Proceedings, Fourth International Conference on Spoken Language, 1996. ICSLP 96. vol. 3, 1996, pp. 1989–1992.

Cowie, 2001, Emotion recognition in human–computer interaction, IEEE Signal Process. Mag., 18, 32, 10.1109/79.911197

Cristianini, 2000

Davitz, 1964

Dempster, 1977, Maximum likelihood from incomplete data via the EM algorithm, J. R. Stat. Soc., 39, 1

L. Devillers, L. Lamel, Emotion detection in task-oriented dialogs, in: Proceedings of the International Conference on Multimedia and Expo 2003, 2003, pp. 549–552.

Duda, 2001

Edwards, 1999, Emotion discourse, Culture Psychol., 5, 271, 10.1177/1354067X9953001

Ekman, 1982

Abu El-Yazeed, 2004, On the determination of optimal model order for GMM-based text-independent speaker identification, EURASIP J. Appl. Signal Process., 8, 1078

I. Engberg, A. Hansen, Documentation of the Danish emotional speech database DES 〈http://cpk.auc.dk/tb/speech/Emotions/〉, 1996.

Ephraim, 2002, Hidden Markov processes, IEEE Trans. Inf. Theory, 48, 1518, 10.1109/TIT.2002.1003838

R. Fernandez, A computational model for the automatic recognition of affect in speech, Ph.D. Thesis, Massachusetts Institute of Technology, February 2004.

France, 2000, Acoustical properties of speech as indicators of depression and suicidal risk, IEEE Trans. Biomedical Eng., 47, 829, 10.1109/10.846676

Freund, 1997, A decision-theoretic generalization of on-line learning and an application to boosting, J. Comput. Syst. Sci., 55, 119, 10.1006/jcss.1997.1504

L. Fu, X. Mao, L. Chen, Speaker independent emotion recognition based on SVM/HMMs fusion system, in: International Conference on Audio, Language and Image Processing, 2008. ICALIP 2008, pp. 61–65.

Gelfer, 1995, Comparisons of jitter, shimmer, and signal-to-noise ratio from directly digitized versus taped voice samples, J. Voice, 9, 378, 10.1016/S0892-1997(05)80199-7

H. Go, K. Kwak, D. Lee, M. Chun, Emotion recognition from the facial image and speech signal, in: Proceedings of the IEEE SICE 2003, vol. 3, 2003, pp. 2890–2895.

Gobl, 2003, The role of voice quality in communicating emotion, mood and attitude, Speech Commun., 40, 189, 10.1016/S0167-6393(02)00082-1

Gorin, 1995, On automated language acquisition, J. Acoust. Soc. Am., 97, 3441, 10.1121/1.412431

Grosz, 1986, Attention, intentions, and the structure of discourse, Comput. Linguist., 12, 175

Hansen, 1995, Icarus: source generator based real-time recognition of speech in noisy stressful and Lombard effect environments, Speech Commun., 16, 391, 10.1016/0167-6393(95)00007-B

Hernando, 1997, Linear prediction of the one-sided autocorrelation sequence for noisy speech recognition, IEEE Trans. Speech Audio Process., 5, 80, 10.1109/89.554273

K. Hirose, H. Fujisaki, M. Yamaguchi, Synthesis by rule of voice fundamental frequency contours of spoken Japanese from linguistic information, in: IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP ’84, vol. 9, 1984, pp. 597–600.

Ho, 1994, Decision combination in multiple classifier systems, IEEE Trans. Pattern Anal. Mach. Intell., 16, 66, 10.1109/34.273716

Hozjan, 2003, Context-independent multilingual emotion recognition from speech signal, Int. J. Speech Technol., 6, 311, 10.1023/A:1023426522496

V. Hozjan, Z. Moreno, A. Bonafonte, A. Nogueiras, Interface databases: design and collection of a multilingual emotional speech database, in: Proceedings of the 3rd International Conference on Language Resources and Evaluation (LREC’02) Las Palmas de Gran Canaria, Spain, 2002, pp. 2019–2023.

H. Hu, M. Xu, W. Wu, Dimensions of emotional meaning in speech, in: Proceedings of the ISCA ITRW on Speech and Emotion, 2000, pp. 25–28.

H. Hu, M. Xu, W. Wu, GMM supervector based SVM with spectral features for speech emotion recognition, in: IEEE International Conference on Acoustics, Speech and Signal Processing, 2007. ICASSP 2007, vol. 4, 2007, pp. IV 413–IV 416.

H. Hu, M.-X. Xu, W. Wu, Fusion of global statistical and segmental spectral features for speech emotion recognition, in: International Speech Communication Association—8th Annual Conference of the International Speech Communication Association, Interspeech 2007, vol. 2, 2007, pp. 1013–1016.

Jain, 2000, Statistical pattern recognition: a review, IEEE Trans. Pattern Anal. Mach. Intell., 22, 4, 10.1109/34.824819

Johnstone, 2005, Affective speech elicited with a computer game, Emotion, 5, 513, 10.1037/1528-3542.5.4.513

Johnstone, 2000

Deller, 1993

Kleinginna, 1981, A categorized list of emotion definitions, with suggestions for a consensual definition, Motivation Emotion, 5, 345, 10.1007/BF00992553

J. Kaiser, On a simple algorithm to calculate the ‘energy’ of the signal, in: ICASSP-90, 1990, pp. 381–384.

Kaiser, 1962, Communication of affects by single vowels, Synthese, 14, 300, 10.1007/BF00869311

E. Kim, K. Hyun, S. Kim, Y. Kwak, Speech emotion recognition using eigen-FFT in clean and noisy environments, in: The 16th IEEE International Symposium on Robot and Human Interactive Communication, 2007, RO-MAN 2007, 2007, pp. 689–694.

Kuncheva, 2002, A theoretical study on six classifier fusion strategies, IEEE Trans. Pattern Anal. Mach. Intell., 24, 281, 10.1109/34.982906

Kuncheva, 2004

O. Kwon, K. Chan, J. Hao, T. Lee, Emotion recognition by speech signal, in: EUROSPEECH Geneva, 2003, pp. 125–128.

Lee, 2005, Toward detecting emotions in spoken dialogs, IEEE Trans. Speech Audio Process., 13, 293, 10.1109/TSA.2004.838534

C. Lee, S. Narayanan, R. Pieraccini, Classifying emotions in human–machine spoken dialogs, in: Proceedings of the ICME’02, vol. 1, 2002, pp. 737–740.

C. Lee, R. Pieraccini, Combining acoustic and language information for emotion recognition, in: Proceedings of the ICSLP 2002, 2002, pp. 873–876.

C. Lee, S. Yildrim, M. Bulut, A. Kazemzadeh, C. Busso, Z. Deng, S. Lee, S. Narayanan, Emotion recognition based on phoneme classes, in: Proceedings of ICSLP, 2004, pp. 2193–2196.

Leinonen, 1997, Expression of emotional-motivational connotations with a one-word utterance, J. Acoust. Soc. Am., 102, 1853, 10.1121/1.420109

X. Li, J. Tao, M.T. Johnson, J. Soltis, A. Savage, K.M. Leong, J.D. Newman, Stress and emotion classification using jitter and shimmer features, in: IEEE International Conference on Acoustics, Speech and Signal Processing, 2007. ICASSP 2007, vol. 4, April 2007, pp. IV-1081–IV-1084.

Lien, 2002, Detection, tracking and classification of action units in facial expression, J. Robotics Autonomous Syst., 31, 131, 10.1016/S0921-8890(99)00103-7

University of Pennsylvania Linguistic Data Consortium, Emotional prosody speech and transcripts 〈http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2002S28〉, July 2002.

J. Liscombe, Prosody and speaker state: paralinguistics, pragmatics, and proficiency, Ph.D. Thesis, Columbia University, 2007.

D.G. Lowe, Object recognition from local scale-invariant features, in: Proceedings of the IEEE International Conference on Computer Vision, vol. 2, 1999, pp. 1150–1157.

M. Lugger, B. Yang, The relevance of voice quality features in speaker independent emotion recognition, in: IEEE International Conference on Acoustics, Speech and Signal Processing, 2007, ICASSP 2007, vol. 4, April 2007, pp. IV-17–IV-20.

M. Lugger, B. Yang, Psychological motivated multi-stage emotion classification exploiting voice quality features, in: F. Mihelic, J. Zibert (Eds.), Speech Recognition, In-Tech, 2008.

M. Lugger, B. Yang, Combining classifiers with diverse feature sets for robust speaker independent emotion recognition, in: Proceedings of EUSIPCO, 2009.

M. Lugger, B. Yang, W. Wokurek, Robust estimation of voice quality parameters under realworld disturbances, in: 2006 IEEE International Conference on Acoustics, Speech and Signal Processing, 2006, ICASSP 2006 Proceedings, vol. 1, May 2006, pp. I–I.

J. Ma, H. Jin, L. Yang, J. Tsai, in: Ubiquitous Intelligence and Computing: Third International Conference, UIC 2006, Wuhan, China, September 3–6, 2006, Proceedings (Lecture Notes in Computer Science), Springer-Verlag, New York, Inc., Secaucus, NJ, USA, 2006.

Markel, 1976

Mashao, 2006, Combining classifier decisions for robust speaker identification, Pattern Recognition, 39, 147, 10.1016/j.patcog.2005.08.004

Mesot, 2007, Switching linear dynamical systems for noise robust speech recognition, IEEE Trans. Audio Speech Language Process., 15, 1850, 10.1109/TASL.2007.901312

Mitra, 2002, Unsupervised feature selection using feature similarity, IEEE Trans. Pattern Anal. Mach. Intell., 24, 301, 10.1109/34.990133

Morrison, 2007, Ensemble methods for spoken emotion recognition in call-centres, Speech Commun., 49, 98, 10.1016/j.specom.2006.11.004

Murray, 1993, Toward a simulation of emotions in synthetic speech: A review of the literature on human vocal emotion, J. Acoust. Soc. Am., 93, 1097, 10.1121/1.405558

Nicholson, 2000, Emotion recognition in speech using neural networks, Neural Comput. Appl., 9, 290, 10.1007/s005210070006

Nwe, 2003, Speech emotion recognition using hidden Markov models, Speech Commun., 41, 603, 10.1016/S0167-6393(03)00099-2

O’Connor, 1973

A. Oster, A. Risberg, The identification of the mood of a speaker by hearing impaired listeners, Speech Transmission Lab. Quarterly Progress Status Report 4, Stockholm, 1986, pp. 79–90.

T. Otsuka, J. Ohya, Recognizing multiple persons’ facial expressions using hmm based on automatic extraction of significant frames from image sequences, in: Proceedings of the International Conference on Image Processing (ICIP-97), 1997, pp. 546–549.

T.L. Pao, Y.-T. Chen, J.-H. Yeh, W.-Y. Liao, Combining acoustic features for improved emotion recognition in Mandarin speech, in: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 3784, 2005, pp. 279–285.

V. Petrushin, Emotion recognition in speech signal: experimental study, development and application, in: Proceedings of the ICSLP 2000, 2000, pp. 222–225.

Picard, 2001, Toward machine emotional intelligence: analysis of affective physiological state, IEEE Trans. Pattern Anal. Mach. Intell., 23, 1175, 10.1109/34.954607

Pierre-Yves, 2003, The production and recognition of emotions in speech: features and algorithms, Int. J. Human–Computer Stud., 59, 157, 10.1016/S1071-5819(02)00141-6

Rabiner, 1986, An introduction to hidden Markov models, IEEE ASSP Mag., 3, 4, 10.1109/MASSP.1986.1165342

Rabiner, 1993

Rabiner, 1978

A. Razak, R. Komiya, M. Abidin, Comparison between fuzzy and NN method for speech emotion recognition, in: 3rd International Conference on Information Technology and Applications ICITA 2005, vol. 1, 2005, pp. 297–302.

Reynolds, 2000, Speaker verification using adapted Gaussian mixture models, Digital Signal Process., 10, 19, 10.1006/dspr.1999.0361

Reynolds, 1995, Robust text-independent speaker identification using Gaussian mixture speaker models, IEEE Trans. Speech Audio Process., 3, 72, 10.1109/89.365379

Rissanen, 1978, Modeling by shortest data description, Automatica, 14, 465, 10.1016/0005-1098(78)90005-5

Scherer, 1986, Vocal affect expression. A review and a model for future research, Psychological Bull., 99, 143, 10.1037/0033-2909.99.2.143

Schlosberg, 1954, Three dimensions of emotion, Psychological Rev., 61, 81, 10.1037/h0054570

M. Schubiger, English intonation: its form and function, Niemeyer, Tubingen, Germany, 1958.

B. Schuller, Towards intuitive speech interaction by the integration of emotional aspects, in: 2002 IEEE International Conference on Systems, Man and Cybernetics, vol. 6, 2002, p. 6.

B. Schuller, M. Lang, G. Rigoll, Robust acoustic speech emotion recognition by ensembles of classifiers, in: Proceedings of the DAGA’05, 31, Deutsche Jahrestagung für Akustik, DEGA, 2005, pp. 329–330.

B. Schuller, S. Reiter, R. Muller, M. Al-Hames, M. Lang, G. Rigoll, Speaker independent speech emotion recognition by ensemble classification, in: IEEE International Conference on Multimedia and Expo, 2005. ICME 2005, 2005, pp. 864–867.

B. Schuller, G. Rigoll, M. Lang, Hidden Markov model-based speech emotion recognition, in: International Conference on Multimedia and Expo (ICME), vol. 1, 2003, pp. 401–404.

B. Schuller, G. Rigoll, M. Lang, Speech emotion recognition combining acoustic features and linguistic information in a hybrid support vector machine-belief network architecture, in: Proceedings of the ICASSP 2004, vol. 1, 2004, pp. 577–580.

M.T. Shami, M.S. Kamel, Segment-based approach to the recognition of emotions in speech, in: IEEE International Conference on Multimedia and Expo, 2005. ICME 2005, 2005, 4pp.

L.C. De Silva, T. Miyasato, R. Nakatsu, Facial emotion recognition using multimodal information, in: Proceedings of the IEEE International Conference on Information, Communications and Signal Processing (ICICS’97), 1997, pp. 397–401.

Slaney, 2003, Babyears: a recognition system for affective vocalizations, Speech Commun., 39, 367, 10.1016/S0167-6393(02)00049-3

Stevens, 1994, Classification of glottal vibration from acoustic measurements, Vocal Fold Physiol., 147

R. Sun, E. Moore, J.F. Torres, Investigating glottal parameters for differentiating emotional categories with similar prosodics, in: IEEE International Conference on Acoustics, Speech and Signal Processing, 2009. ICASSP 2009, April 2009, pp. 4509–4512.

Tao, 2006, Prosody conversion from neutral speech to emotional speech, IEEE Trans. Audio Speech Language Process., 14, 1145, 10.1109/TASL.2006.876113

Teager, 1980, Some observations on oral air flow during phonation, IEEE Trans. Acoust. Speech Signal Process., 28, 599, 10.1109/TASSP.1980.1163453

H. Teager, S. Teager, Evidence for nonlinear production mechanisms in the vocal tract, in: Speech Production and Speech Modelling, Nato Advanced Institute, vol. 55, 1990, pp. 241–261.

Tsymbal, 2005, Diversity in search strategies for ensemble feature selection, Inf. Fusion, 6, 146

Tsymbal, 2003, Ensemble feature selection with the simple Bayesian classification, Inf. Fusion, 4, 146

D. Ververidis, C. Kotropoulos, Emotional speech classification using Gaussian mixture models and the sequential floating forward selection algorithm, in: IEEE International Conference on Multimedia and Expo, 2005. ICME 2005, July 2005, pp. 1500–1503.

Ververidis, 2006, Emotional speech recognition: resources, features and methods, Speech Commun., 48, 1162, 10.1016/j.specom.2006.04.003

D. Ververidis, C. Kotropoulos, I. Pitas, Automatic emotional speech classification, in: IEEE International Conference on Acoustics, Speech, and Signal Processing, 2004, Proceedings, (ICASSP ’04), vol. 1, 2004, pp. I-593-6.

Viterbi, 1967, Error bounds for convolutional codes and an asymptotically optimum decoding algorithm, IEEE Trans. Inf. Theory, 13, 260, 10.1109/TIT.1967.1054010

Vlassis, 1999, A kurtosis-based dynamic approach to Gaussian mixture modeling, IEEE Trans. Syst. Man Cybern., 29, 393, 10.1109/3468.769758

Vlassis, 2002, A greedy EM algorithm for Gaussian mixture learning, Neural Process. Lett., 15, 77, 10.1023/A:1013844811137

Wang, 2006, A dynamic conditional random field model for foreground and shadow segmentation, IEEE Trans. Pattern Anal. Mach. Intell., 28, 279, 10.1109/TPAMI.2006.25

Williams, 1972, Emotions and speech: some acoustical correlates, J. Acoust. Soc. Am., 52, 1238, 10.1121/1.1913238

C. Williams, K. Stevens, Vocal correlates of emotional states, Speech Evaluation in Psychiatry, Grune and Stratton, 1981, pp. 189–220.

Witten, 2000

Womack, 1999, N-channel hidden Markov models for combined stressed speech classification and recognition, IEEE Trans. Speech Audio Process., 7, 668, 10.1109/89.799692

J. Wu, M.D. Mullin, J.M. Rehg, Linear asymmetric classifier for cascade detectors, in: 22nd International Conference on Machine Learning, 2005.

J.H.L. Hansen, S. Bou-Ghazale, Getting started with SUSAS: a speech under simulated and actual stress database, in: EUROSPEECH-97, vol. 4, 1997, pp. 1743–1746.

M. You, C. Chen, J. Bu, J. Liu, J. Tao, Emotion recognition from noisy speech, in: IEEE International Conference on Multimedia and Expo, 2006, 2006, pp. 1653–1656.

M. You, C. Chen, J. Bu, J. Liu, J. Tao, Emotional speech analysis on nonlinear manifold, in: 18th International Conference on Pattern Recognition, 2006. ICPR 2006, vol. 3, 2006, pp. 91–94.

M. You, C. Chen, J. Bu, J. Liu, J. Tao, A hierarchical framework for speech emotion recognition, in: IEEE International Symposium on Industrial Electronics, 2006, vol. 1, 2006, pp. 515–519.

Young, 1996, Large vocabulary continuous speech recognition, IEEE Signal Process. Mag., 13, 45, 10.1109/79.536824

Zhou, 2001, Nonlinear feature based classification of speech under stress, IEEE Trans. Speech Audio Process., 9, 201, 10.1109/89.905995

J. Zhou, G. Wang, Y. Yang, P. Chen, Speech emotion recognition based on rough set and SVM, in: 5th IEEE International Conference on Cognitive Informatics, 2006, ICCI 2006, vol. 1, 2006, pp. 53–61.