Spatial hierarchy perception and hard samples metric learning for object detection in high-resolution remote sensing images

Springer Science and Business Media LLC - Volume 52 - Pages 3193-3208 - 2021
Dongjun Zhu1,2, Shixiong Xia1,2, Jiaqi Zhao1,2, Yong Zhou1,2, Qiang Niu1,2, Rui Yao1,2, Ying Chen1,2
1School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, China
2Engineering Research Center of Mine Digitization, Ministry of Education of the People's Republic of China, Xuzhou, China

Abstract

Owing to varied shooting angles, altitudes, and scenes, remote sensing images contain many complex backgrounds and multi-scale objects. Moreover, objects in remote sensing images are usually much smaller than the background and are easily occluded by buildings and trees. This makes feature extraction difficult and increases the intra-class diversity of objects, which makes object detection in remote sensing images more challenging. In this paper, we propose a novel remote sensing image object detection method (SHDet) based on a spatial hierarchy perception component (SHPC) and hard samples metric learning (HSML). We design SHPC to extract features under different spatial hierarchies and to learn the contribution weights between feature channels, enhancing the feature representation ability. HSML is proposed to narrow the feature differences between hard samples of the same category, reducing the detection errors caused by intra-class diversity. In addition, we strip away the complex backgrounds to build pre-training datasets for pre-training the object detection model, which strengthens the learning of object features. Experiments conducted on two widely used remote sensing datasets (NWPU VHR-10 and DOTA-v1.5) show that the proposed method achieves better detection performance than several state-of-the-art object detection methods.
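The abstract gives no equations for SHPC or HSML, so the following is only a minimal NumPy sketch of the two ideas under stated assumptions, not the authors' implementation. It assumes that the features from different spatial hierarchies are already available as separate branch outputs, that channel weights come from a squeeze-and-excitation style bottleneck, and that "hard" samples are those with the largest per-sample detection loss. The names `channel_reweight`, `hard_sample_metric_loss`, `w1`, `w2`, and `top_k` are hypothetical.

```python
import numpy as np


def channel_reweight(branches, w1, w2):
    """Fuse branch feature maps and rescale each channel by a learned weight.

    branches: list of arrays shaped (C_i, H, W), one per spatial hierarchy
              (assumed to come from layers with different receptive fields).
    w1, w2:   weights of a two-layer bottleneck, shapes (R, C) and (C, R),
              where C is the sum of all C_i. Hypothetical parameters.
    """
    fused = np.concatenate(branches, axis=0)         # (C, H, W)
    squeezed = fused.mean(axis=(1, 2))               # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)          # ReLU
    weights = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))   # sigmoid, one weight per channel
    return fused * weights[:, None, None], weights


def hard_sample_metric_loss(features, labels, sample_losses, top_k=2):
    """Mean squared distance from each class's hardest samples to its class centre.

    features:      (N, D) feature vectors of candidate objects.
    labels:        (N,) integer class ids.
    sample_losses: (N,) per-sample detection loss; larger is assumed to mean harder.
    """
    per_class = []
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)
        centre = features[idx].mean(axis=0)
        # Keep the top_k samples of this class with the largest loss.
        hardest = idx[np.argsort(sample_losses[idx])[::-1][:top_k]]
        sq_dist = np.sum((features[hardest] - centre) ** 2, axis=1)
        per_class.append(np.mean(sq_dist))
    return float(np.mean(per_class))
```

Minimizing a term like `hard_sample_metric_loss` alongside the usual detection loss would pull hard samples toward the other samples of their class, which is one plausible reading of "narrowing the feature differences of hard samples in the same category".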

Keywords

#remote sensing #object detection #features #spatial hierarchy #hard samples metric learning
