Learning Multi-level Deep Representations for Image Emotion Classification

Springer Science and Business Media LLC - Volume 51 - Pages 2043-2061 - 2019
Tianrong Rao1, Xiaoxu Li1, Min Xu1
1GBDTC, Faculty of Engineering and Information Technology, University of Technology Sydney, Sydney, Australia

Abstract

In this paper, we propose a new deep network that learns multi-level deep representations for image emotion classification (MldrNet). Image emotion can be recognized through image semantics, image aesthetics and low-level visual features from both global and local views. Existing image emotion classification works using hand-crafted features or deep features mainly focus on either low-level visual features or semantic-level image representations, without taking all factors into consideration. The proposed MldrNet combines deep representations of different levels, i.e. image semantics, image aesthetics and low-level visual features, to effectively classify the emotion types of different kinds of images, such as abstract paintings and web images. Extensive experiments on both Internet images and abstract paintings demonstrate that the proposed method outperforms state-of-the-art methods using deep features or hand-crafted features, with at least a 6% improvement in overall classification accuracy.
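The fusion idea described in the abstract can be sketched in a few lines. The sketch below is a minimal, hypothetical illustration, not the authors' actual architecture: the three branch functions stand in for CNN branches at different representation levels (semantic, aesthetic, low-level), their outputs are fused by concatenation, and a linear classifier maps the fused vector to the eight emotion categories of Mikels et al. used by the datasets cited below. All function names, feature sizes and weights here are assumptions for illustration.

```python
import math
import random

random.seed(0)

# Mikels' eight emotion categories (assumed label set for illustration).
EMOTIONS = ["amusement", "awe", "contentment", "excitement",
            "anger", "disgust", "fear", "sadness"]

def semantic_branch(image):
    """Stand-in for a deep semantic CNN branch (hypothetical)."""
    return [random.random() for _ in range(16)]

def aesthetic_branch(image):
    """Stand-in for a mid-level image-aesthetics branch (hypothetical)."""
    return [random.random() for _ in range(16)]

def low_level_branch(image):
    """Stand-in for a shallow colour/texture branch (hypothetical)."""
    return [random.random() for _ in range(16)]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def classify(image, weights):
    # Fuse the three representation levels by concatenation,
    # then apply a linear classifier followed by softmax.
    fused = (semantic_branch(image)
             + aesthetic_branch(image)
             + low_level_branch(image))
    logits = [sum(w * f for w, f in zip(row, fused)) for row in weights]
    return softmax(logits)

# Random linear weights standing in for a trained fusion classifier.
weights = [[random.gauss(0.0, 0.1) for _ in range(48)] for _ in EMOTIONS]
probs = classify(None, weights)
pred = EMOTIONS[max(range(len(probs)), key=probs.__getitem__)]
print(pred)
```

In the actual MldrNet the fusion is learned end-to-end rather than applied to fixed random features; this sketch only shows how multiple levels of representation can feed a single emotion classifier.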

References

Alameda-Pineda X, Ricci E, Yan Y, Sebe N (2016) Recognizing emotions from abstract paintings using non-linear matrix completion. In: CVPR
Andrearczyk V, Whelan PF (2016) Using filter banks in convolutional neural networks for texture classification. Pattern Recognit Lett 84:63–69
Aronoff J (2006) How we recognize angry and happy emotion in people, places, and things. Cross-Cult Res 40(1):83–105
Borth D, Ji R, Chen T, Breuel T, Chang SF (2013) Large-scale visual sentiment ontology and detectors using adjective noun pairs. In: ACM MM
Chen CH, Patel VM, Chellappa R (2015) Matrix completion for resolving label ambiguity. In: CVPR
Chen T, Yu FX, Chen J, Cui Y, Chen YY, Chang SF (2014) Object-based visual sentiment concept analysis and application. In: ACM MM
Cui Z, Shi X, Chen Y (2016) Sentiment analysis via integrating distributed representations of variable-length word sequence. Neurocomputing 187:126–132
Deng J, Dong W, Socher R, Li LJ, Li K, Fei-Fei L (2009) ImageNet: a large-scale hierarchical image database. In: CVPR
Hanjalic A (2006) Extracting moods from pictures and sounds: towards truly personalized TV. IEEE Signal Process Mag 23(2):90–100
Hanjalic A, Xu LQ (2005) Affective video content representation and modeling. IEEE Trans Multimed 7(1):143–154
He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: CVPR, pp 770–778
Hu C, Xu Z, Liu Y, Mei L, Chen L, Luo X (2014) Semantic link network-based model for organizing multimedia big data. IEEE Trans Emerg Top Comput 2(3):376–387
Itten J, Van Haagen E (1962) The art of color: the subjective experience and objective rationale of colour. Reinhold, New York
Joachims T (1999) Transductive inference for text classification using support vector machines. In: ICML
Joshi D, Datta R, Fedorovskaya E, Luong QT, Wang JZ, Li J, Luo J (2011) Aesthetics and emotions in images. IEEE Signal Process Mag 28(5):94–115
Jufeng Y, Ming S, Xiaoxiao S (2017) Learning visual sentiment distributions via augmented conditional probability neural network. In: AAAI
Jufeng Y, Dongyu S, Ming S, Ming-Ming C, Rosin PL, Liang W (2018) Visual sentiment prediction based on automatic discovery of affective regions. TMM 99:1–1
Kang HB (2003) Affective content detection using HMMs. In: ACM MM
Krizhevsky A, Sutskever I, Hinton GE (2012) ImageNet classification with deep convolutional neural networks. In: NIPS
Lang PJ (1979) A bio-informational theory of emotional imagery. Psychophysiology 16(6):495–512
Lang PJ, Bradley MM, Cuthbert BN (2008) International affective picture system (IAPS): affective ratings of pictures and instruction manual. Technical report A-8
Long J, Shelhamer E, Darrell T (2015) Fully convolutional networks for semantic segmentation. In: CVPR
Lu X, Suryanarayan P, Adams Jr RB, Li J, Newman MG, Wang JZ (2012) On shape and the computability of emotions. In: ACM MM
Lu X, Lin Z, Jin H, Yang J, Wang JZ (2014) RAPID: rating pictorial aesthetics using deep learning. In: ACM MM
Machajdik J, Hanbury A (2010) Affective image classification using features inspired by psychology and art theory. In: ACM MM, pp 83–92
Mikels JA, Fredrickson BL, Larkin GR, Lindberg CM, Maglio SJ, Reuter-Lorenz PA (2005) Emotional category data on images from the international affective picture system. Behav Res Methods 37(4):626–630
Peng KC, Chen T, Sadovnik A, Gallagher AC (2015) A mixed bag of emotions: model, predict, and transfer emotion distributions. In: CVPR
Poria S, Cambria E, Howard N, Huang GB, Hussain A (2016) Fusing audio, visual and textual clues for sentiment analysis from multimodal content. Neurocomputing 174:50–59
Rao T, Xu M, Liu H, Wang J, Burnett I (2016) Multi-scale blocks based image emotion classification using multiple instance learning. In: ICIP
Ren S, He K, Girshick R, Sun J (2015) Faster R-CNN: towards real-time object detection with region proposal networks. In: NIPS
Sartori A, Culibrk D, Yan Y, Sebe N (2015) Who's afraid of Itten: using the art theory of color combination to analyze emotions in abstract paintings. In: ACM MM
Shepstone SE, Tan ZH, Jensen SH (2014) Using audio-derived affective offset to enhance TV recommendation. IEEE Trans Multimed 16(7):1999–2010
Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556
Soleymani M, Larson M, Pun T, Hanjalic A (2014) Corpus development for affective video indexing. IEEE Trans Multimed 16(4):1075–1089
Solli M, Lenz R (2009) Color based bags-of-emotions. In: CAIP
Song K, Yao T, Ling Q, Mei T (2018) Boosting image sentiment analysis with visual attention. Neurocomputing 312:218–228
Sun X, Li C, Ren F (2016) Sentiment analysis for Chinese microblog based on deep neural networks with convolutional extension features. Neurocomputing 210:227–236
Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A (2015) Going deeper with convolutions. In: CVPR
Tkalcic M, Odic A, Kosir A, Tasic J (2013) Affective labeling in a content-based recommender system for images. IEEE Trans Multimed 15(2):391–400
Wang W, He Q (2008) A survey on emotional semantic image retrieval. In: ICIP
Wei-ning W, Ying-lin Y, Jian-chao Z (2004) Image emotional classification: static vs. dynamic. In: SMC
Xu M, Jin JS, Luo S, Duan L (2008) Hierarchical movie affective content analysis based on arousal and valence features. In: ACM MM
Yadati K, Katti H, Kankanhalli M (2014) CAVVA: computational affective video-in-video advertising. IEEE Trans Multimed 16(1):15–23
Yanulevskaya V, Van Gemert J, Roth K, Herbold AK, Sebe N, Geusebroek JM (2008) Emotional valence categorization using holistic image features. In: ICIP
Yanulevskaya V, Uijlings J, Bruni E, Sartori A, Zamboni E, Bacci F, Melcher D, Sebe N (2012) In the eye of the beholder: employing statistical analysis and eye tracking for analyzing abstract paintings. In: ACM MM
You Q, Luo J, Jin H, Yang J (2016) Building a large scale dataset for image emotion recognition: the fine print and the benchmark. In: AAAI
Zeiler MD, Fergus R (2014) Visualizing and understanding convolutional networks. In: ECCV
Zhao S, Gao Y, Jiang X, Yao H, Chua TS, Sun X (2014) Exploring principles-of-art features for image emotion recognition. In: ACM MM
Zhao S, Yao H, Gao Y, Ji R, Ding G (2017) Continuous probability distribution prediction of image emotions via multitask shared sparse regression. IEEE Trans Multimed 19(3):632–645
Zhao S, Ding G, Gao Y, Zhao X, Tang Y, Han J, Yao H, Huang Q (2018) Discrete probability distribution prediction of image emotions with shared sparse learning. IEEE Trans Affect Comput. https://doi.org/10.1109/TAFFC.2018.2818685
Zhao S, Yao H, Gao Y, Ding G, Chua TS (2018) Predicting personalized image emotion perceptions in social networks. IEEE Trans Affect Comput 9(4):526–540
Zhou B, Lapedriza A, Xiao J, Torralba A, Oliva A (2014) Learning deep features for scene recognition using places database. In: NIPS