Visual sentiment analysis with semantic correlation enhancement

Complex & Intelligent Systems - Pages 1-13 - 2023

Hao Zhang¹, Yanan Liu¹, Zhaoyu Xiong¹, Zhichao Wu², Dan Xu¹

¹School of Information Science and Engineering, Yunnan University, Kunming, China

²School of Artificial Intelligence, Beijing Normal University, Beijing, China

Abstract

Visual sentiment analysis is in great demand because it provides a computational method for recognizing sentiment information in the abundant visual content on social media sites. Most existing methods use CNNs to extract varying visual attributes for image sentiment prediction, but they fail to comprehensively consider the correlation among visual components and, as a result, are limited by the receptive field of convolutional layers. In this work, we propose the visual semantic correlation network (VSCNet), a Transformer-based visual sentiment prediction model. Specifically, global visual features are captured through an extended attention network built by stacking a well-designed, Transformer-like extended attention mechanism. An off-the-shelf object query tool is used to determine the local candidates of potential affective regions, which filters out redundant and noisy visual proposals. All candidates considered affective are embedded into a computable semantic space. Finally, a fusion strategy integrates the semantic representations and visual features for sentiment analysis. Extensive experiments reveal that our method outperforms previous studies on five annotated public image sentiment datasets without any training tricks. More specifically, it achieves 1.8% higher accuracy on the FI benchmark than other state-of-the-art methods.
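The pipeline outlined in the abstract (filter object candidates, embed the survivors in a semantic space, fuse with global visual features, classify) can be illustrated with a toy sketch. This is not the VSCNet implementation: the abstract does not specify the detector, the embedding, or the fusion strategy, so the confidence threshold, the mean-pooled label embeddings, the concatenation fusion, the linear classifier, and every function name below are assumptions made purely for illustration.

```python
import math


def filter_candidates(detections, min_score=0.5):
    """Drop low-confidence proposals (assumed filtering rule, not from the paper)."""
    return [d for d in detections if d["score"] >= min_score]


def mean_embedding(labels, embedding_table, dim):
    """Average the embedding vectors of the known labels; zeros if none remain."""
    vectors = [embedding_table[label] for label in labels if label in embedding_table]
    if not vectors:
        return [0.0] * dim
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_sentiment(visual_feature, detections, embedding_table,
                      weights, bias, min_score=0.5):
    """Fuse global visual features with semantic features and classify.

    Fusion here is plain concatenation followed by a linear layer,
    which is a hypothetical stand-in for the paper's fusion strategy.
    """
    kept = filter_candidates(detections, min_score)
    dim = len(next(iter(embedding_table.values())))
    semantic = mean_embedding([d["label"] for d in kept], embedding_table, dim)
    fused = list(visual_feature) + semantic
    logits = [sum(w * x for w, x in zip(row, fused)) + b
              for row, b in zip(weights, bias)]
    return softmax(logits)


# Toy inputs: a 2-d "global" feature, two proposals, 2-d label embeddings.
toy_visual = [0.2, -0.1]
toy_detections = [{"label": "dog", "score": 0.9},
                  {"label": "fence", "score": 0.2}]
toy_embeddings = {"dog": [1.0, 0.0], "fence": [0.0, 1.0]}
toy_weights = [[1.0, 0.0, 1.0, 0.0],   # class 0 (e.g. positive)
               [0.0, 1.0, 0.0, 1.0]]   # class 1 (e.g. negative)
toy_bias = [0.0, 0.0]

probabilities = predict_sentiment(toy_visual, toy_detections, toy_embeddings,
                                  toy_weights, toy_bias)
print(probabilities)
```

In this toy run the low-confidence "fence" proposal is discarded before embedding, so only "dog" contributes to the semantic vector that is concatenated with the visual feature.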

