Fusion based feature reinforcement component for remote sensing image object detection

Multimedia Tools and Applications - Tập 79 - Trang 34973-34992 - 2020
Dongjun Zhu1,2, Shixiong Xia1,2, Jiaqi Zhao1,2, Yong Zhou1,2, Qiang Niu1,2, Rui Yao1,2, Ying Chen1,2
1School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, China
2Engineering Research Center of Mine Digitization, Ministry of Education of the Peoples Republic of China, Xuzhou, China

Tóm tắt

In recent years, convolutional neural networks (CNN) have been extensively used for generic object detection due to their powerful feature extraction capabilities. This has hence motivated researchers to adopt this technology in the field of remote sensing. However, remote sensing images can contain large amounts of noise, have complex backgrounds, include small dense objects as well as being susceptible to weather and light intensity variations. Moreover, from different shooting angles, objects can either have different shapes or be obscured by structures such as buildings and trees. Due to these, effective features extraction for proper representation is still very challenging from remote sensing images. This paper therefore proposes a novel remote sensing image object detection approach applying a fusion-based feature reinforcement component (FB-FRC) to improve the discrimination between object feature. Specifically, two fusion strategies are proposed: (i) a hard fusion strategy through artificially-set rules, and (ii) a soft fusion strategy by learning the fusion parameters. Experiments carried out on four widely used remote sensing datasets (NWPU VHR-10, VisDrone2018, DOTA and RSOD) have shown promising results where the proposed approach manages to outperform several state-of-the-art methods.

Tài liệu tham khảo

Cheng G, Han J (2016) A survey on object detection in optical remote sensing images. ISPRS J Photogramm Remote Sens 117:11–28 Cheng G, Han J, Guo L, Qian X, Zhou P, Yao X, Xintao Hu (2013) Object detection in remote sensing imagery using a discriminatively trained mixture model. ISPRS J Photogramm Remote Sens 85:32–43 Cheng G, Han J, Zhou P, Guo L (2014) Multi-class geospatial object detection and geographic image classification based on collection of part detectors. ISPRS J Photogramm Remote Sens 98:119–132 Dai J, Yi Li, He K, Sun J (2016) R-fcn: object detection via region-based fully convolutional networks. Advances in neural information processing systems, pp 379–387 Dalal N, Triggs B (2005) Histograms of oriented gradients for human detection. In: IEEE computer society conference on computer vision and pattern recognition, 2005. CVPR 2005, vol 1. IEEE, pp 886–893 Esmael AA, Santos JAD, Torres RDS (2018) On the ensemble of multiscale object-based classifiers for aerial images: a comparative study. Multimed Tools Appl 77(11):1–28 Everingham M, Gool LV, Williams CKI, Winn J, Zisserman A (2010) The pascal visual object classes (voc) challenge. Int J Comput Vis 88(2):303–338 Girshick R (2015) Fast r-cnn. In: Proceedings of the IEEE international conference on computer vision, pp 1440–1448 Girshick R, Donahue J, Darrell T, Malik J (2014) Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 580–587 Han J, Zhang D, Cheng G, Guo L, Ren J (2015) Object detection in optical remote sensing images based on weakly supervised learning and high-level feature learning. IEEE Trans Geosci Remote Sens 53(6):3325–3337 Haralick RM, Shanmugam K, et al (1973) Textural features for image classification. IEEE Transactions on Systems, Man, and Cybernetics (6): 610–621 He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 770–778 Krizhevsky A, Sutskever I, Hinton GE (2012) Imagenet classification with deep convolutional neural networks. In: Advances in neural information processing systems, pp 1097–1105 Li X, Wang S (2017) Object detection using convolutional neural networks in a coarse-to-fine manner. IEEE Geosci Remote Sens Lett 14(11):2037–2041 Lin T-Y, Dollár P, Girshick RB, He K, Hariharan B, Belongie SJ (2017) Feature pyramid networks for object detection. In: CVPR, vol 1, p 4 Lin T-Y, Maire M, Belongie S, Hays J, Perona P, Ramanan D, Dollár P, Zitnick CL (2014) Microsoft coco: common objects in context. In: European conference on computer vision. Springer, Berlin, pp 740–755 Liu L, Ouyang W, Wang X, Fieguth P, Chen J, Liu X, Pietikäinen M (2019) Deep learning for generic object detection: a survey. International Journal of Computer Vision Liu W, Anguelov D, Erhan D, Szegedy C, Reed S, Fu C-Y, Berg AC (2016) Ssd: single shot multibox detector. In: European conference on computer vision. Springer, Berlin, pp 21–37 Long Y, Gong Y, Xiao Z, Liu Q (2017) Accurate object localization in remote sensing images based on convolutional neural networks. IEEE Trans Geosci Remote Sens 55(5):2486–2498 Lowe DG (2004) Distinctive image features from scale-invariant keypoints. Int J Comput Vis 60(2):91–110 Neubeck A, Van Gool L (2006) Efficient non-maximum suppression. In: International conference on pattern recognition Redmon J, Divvala S, Girshick R, Farhadi A (2016) You only look once: unified, real-time object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 779–788 Ren S, He K, Girshick R, Sun J (2015) Faster r-cnn: towards real-time object detection with region proposal networks. pp 91–99 Russakovsky O, Deng J, Hao Su, Krause J, Satheesh S, Ma S, Huang Z, Karpathy A, Khosla A, Bernstein M (2015) Imagenet large scale visual recognition challenge. Int J Comput Vis 115(3):211–252 Senaras C, Ozay M, Vural FTY (2013) Building detection with decision fusion. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 6(3):1295–1304 Tao Q, Zhang Q, Sun S (2017) Vehicle detection from high-resolution aerial images using spatial pyramid pooling-based deep convolutional neural networks. Multimed Tools Appl 76(20):21651–21663 Wang C, Bai X, Wang S, Zhou J, Ren P (2018) Multiscale visual attention networks for object detection in vhr remote sensing images. IEEE Geoscience and Remote Sensing Letters Xia G-S, Bai X, Ding J, Zhu Z, Belongie S, Luo J, Datcu M, Pelillo M, Zhang L (2018) Dota: a large-scale dataset for object detection in aerial images. In: Proc. CVPR Xiao Z, Liu Q, Tang G, Zhai X (2015) Elliptic fourier transformation-based histograms of oriented gradients for rotationally invariant object detection in remote-sensing images. Int J Remote Sens 36(2):618–644 Zhu P, Wen L, Bian X, Ling H, Hu Q (2018) Vision meets drones: a challenge. arXiv:1804.07437