Learning multi-level representations for affective image recognition

Neural Computing and Applications - Volume 34 - Pages 14107-14120 - 2022
Hao Zhang1, Dan Xu1, Gaifang Luo2, Kangjian He1
1School of Information Science and Engineering, Yunnan University, Kunming, China
2School of Software, Shanxi Agricultural University, Jinzhong, China

Abstract

Images can convey intense affective experiences and influence people emotionally. With the prevalence of online pictures and videos, evaluating emotions from visual content has attracted considerable attention. Affective image recognition aims to automatically classify the emotions conveyed by digital images. Existing studies, whether based on hand-crafted features or deep networks, mainly focus on low-level visual features or high-level semantic representations without considering all of these factors. To better understand how deep networks work on affective recognition tasks, we investigate their convolutional features by visualizing them. Our research shows that a hierarchical CNN model mainly relies on deep semantic information while ignoring shallow visual details, which are essential for evoking emotions. To form a more general and discriminative representation, we propose a multi-level hybrid model that learns and integrates deep semantic and shallow visual representations for sentiment classification. In addition, this study shows that class imbalance harms performance, because the majority category of an affective dataset overwhelms training and degrades the deep network. Therefore, a new loss function is introduced to optimize the deep affective model. Experimental results on several affective image recognition datasets show that our model outperforms various existing methods. The source code is publicly available.
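
The abstract does not specify the new loss function, so the following is only a minimal sketch of why class imbalance matters and how a loss can be rebalanced. It uses inverse-frequency class weighting, a common stand-in technique that is an assumption here and not the authors' loss. The class names ("joy", "fear"), the 8:2 split, and both function names are hypothetical and chosen purely for illustration.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    Rare classes get weights above 1, frequent classes below 1.
    """
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * counts[c]) for c in counts}


def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of -log p(true class).

    probs: one dict per sample mapping class name -> predicted probability.
    weights: dict mapping class name -> weight.
    """
    total = sum(weights[y] * -math.log(p[y]) for p, y in zip(probs, labels))
    norm = sum(weights[y] for y in labels)
    return total / norm


# Hypothetical imbalanced dataset: 8 "joy" images, 2 "fear" images.
labels = ["joy"] * 8 + ["fear"] * 2

# A degenerate classifier that always favours the majority class.
probs = [{"joy": 0.9, "fear": 0.1} for _ in labels]

uniform = {c: 1.0 for c in set(labels)}
weights = inverse_frequency_weights(labels)

plain_loss = weighted_cross_entropy(probs, labels, uniform)
balanced_loss = weighted_cross_entropy(probs, labels, weights)

print(f"unweighted loss: {plain_loss:.4f}")
print(f"class-balanced loss: {balanced_loss:.4f}")
```

In this toy setting the majority-biased classifier receives a noticeably higher loss once the classes are reweighted, so training can no longer score well by favouring the dominant emotion category alone.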
