Vocal emotion recognition in five native languages of Assam using new wavelet features