Bi-attention network for bi-directional salient object detection

Springer Science and Business Media LLC - Volume 53 - Pages 21500-21516 - 2023
Cheng Xu1, Hui Wang1, Xianhui Liu1, Weidong Zhao1
1School of Electronics and Information Engineering, Tongji University, Shanghai, China

Abstract

Saliency detection models based on neural networks have achieved outstanding results, but problems remain, such as inaccurate object boundaries and redundant parameters. To alleviate these problems, we make full use of position and contour information from the down-sampling layers and optimize the detection result layer by layer. First, we design an attention-based adaptive fusion (AAF) module, which suppresses the background and highlights the foreground that is more relevant to the detection task. It automatically learns the fusion weights of different features to filter out conflicting information. Second, we propose a bi-attention block that combines reverse attention and positive attention. Third, we introduce bi-directional task learning by decomposing the image into high-frequency and low-frequency components, which fully exploits the complementary and independent nature of the different frequency information. Finally, the proposed method is compared with 14 state-of-the-art methods on 6 datasets and achieves very competitive results. Additionally, the model size is only 114.19 MB, and the inference speed reaches nearly 40 FPS.
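The abstract names two ideas that a small example may make more concrete: pairing positive attention with reverse attention, and splitting an image into low-frequency and high-frequency components. The sketch below is illustrative only and is not the authors' implementation. It assumes that positive attention is the sigmoid of a coarse saliency prediction, that reverse attention is its complement (a common formulation in the reverse-attention literature), and that the low-frequency component is obtained with a simple box blur. The abstract specifies none of these details, and all function names are hypothetical.

```python
import math


def sigmoid(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def bi_attention_weights(coarse_logits):
    """Return (positive, reverse) attention maps for a 2D grid of logits.

    Assumption: positive attention emphasizes regions already predicted
    as salient, reverse attention emphasizes the remaining regions.
    """
    positive = [[sigmoid(v) for v in row] for row in coarse_logits]
    reverse = [[1.0 - p for p in row] for row in positive]
    return positive, reverse


def box_blur(image, radius=1):
    """Low-pass filter: mean over a (2r+1)x(2r+1) window clipped at borders."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            window = [
                image[y][x]
                for y in range(max(0, i - radius), min(h, i + radius + 1))
                for x in range(max(0, j - radius), min(w, j + radius + 1))
            ]
            out[i][j] = sum(window) / len(window)
    return out


def split_frequencies(image, radius=1):
    """Split an image into (low, high) parts so that low + high == image.

    Assumption: a box blur stands in for whatever low-pass operation
    the paper actually uses.
    """
    low = box_blur(image, radius)
    high = [
        [image[i][j] - low[i][j] for j in range(len(image[0]))]
        for i in range(len(image))
    ]
    return low, high


# Toy inputs: a 2x2 grid of logits and a 3x3 image with one bright pixel.
positive, reverse = bi_attention_weights([[0.0, 2.0], [-2.0, 0.0]])
image = [[0.0, 0.0, 0.0], [0.0, 9.0, 0.0], [0.0, 0.0, 0.0]]
low, high = split_frequencies(image)
```

In this sketch the two attention maps sum to one at every pixel, and the high-frequency part is defined as the residual of the low-pass output, so the two components reconstruct the input exactly. A real model would apply these operations to learned feature maps rather than raw nested lists.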
