A robust infrared and visible image fusion framework via multi-receptive-field attention and color visual perception

Springer Science and Business Media LLC - Volume 53 - Pages 8114-8132 - 2022
Zhaisheng Ding1, Haiyan Li1, Dongming Zhou1, Yanyu Liu1, Ruichao Hou2
1School of Information and Artificial Intelligence, Yunnan University, Kunming, China
2State Key Laboratory for Novel Software Technology at Nanjing University, Nanjing, China

Abstract

In this paper, a robust infrared and visible image fusion scheme that combines a dual-branch multi-receptive-field neural network with a color vision transfer algorithm is designed to fuse infrared and visible video sequences. The proposed method makes thermal objects easy to recognize in the fused image while retaining rich texture information and preserving visual perception quality. The fusion network is an integrated encoder-decoder model with a multi-receptive-field attention mechanism, implemented via hybrid dilated convolution (HDC) and a series of convolution layers, and it is trained as an unsupervised framework. Specifically, the multi-receptive-field attention mechanism extracts comprehensive spatial information so that the encoder can focus separately on the substantial thermal radiation in the infrared modality and the environmental characteristics in the visible modality. In addition, to ensure that the fused image has rich color, high fidelity and steady brightness, a color vision transfer method is proposed that recolors the fused grayscale result by deriving a color map from the visible image, which serves as the reference. Extensive experiments, evaluated both subjectively and objectively, verify the importance and robustness of each step and demonstrate that our work achieves a trade-off among color fidelity, fusion performance and computational efficiency. The research content, data and code will be made publicly available at https://github.com/DZSYUNNAN/RGB-TIR-image-fusion.
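The two ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The recoloring step is assumed here to be a simple luminance replacement in YCbCr space (BT.601 full-range coefficients), a common baseline rather than the color vision transfer method of the paper. The receptive-field helper assumes stride-1 stacked 3x3 dilated convolutions with the illustrative dilation rates (1, 2, 5). The function names `recolor_fused` and `hdc_receptive_field` are hypothetical.

```python
import numpy as np

# BT.601 full-range RGB -> YCbCr matrix (chroma left zero-centred).
# Assumption: the paper's colour transfer may use a different colour space.
_RGB2YCBCR = np.array([
    [0.299, 0.587, 0.114],
    [-0.168736, -0.331264, 0.5],
    [0.5, -0.418688, -0.081312],
])
_YCBCR2RGB = np.linalg.inv(_RGB2YCBCR)


def recolor_fused(fused_gray, visible_rgb):
    """Recolor a fused grayscale image using the visible image as reference.

    fused_gray:  (H, W) float array in [0, 1], output of the fusion network.
    visible_rgb: (H, W, 3) float array in [0, 1], the colour reference.
    Returns an (H, W, 3) float RGB array in [0, 1].
    """
    # Split the reference into luminance (Y) and chrominance (Cb, Cr).
    ycbcr = visible_rgb @ _RGB2YCBCR.T
    # Keep the reference chrominance, swap in the fused luminance.
    ycbcr[..., 0] = fused_gray
    rgb = ycbcr @ _YCBCR2RGB.T
    return np.clip(rgb, 0.0, 1.0)


def hdc_receptive_field(kernel_size=3, dilations=(1, 2, 5)):
    """Receptive field of stacked stride-1 dilated convolutions.

    Each layer with dilation d widens the field by (kernel_size - 1) * d.
    The default rates are an illustrative assumption, not the paper's values.
    """
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    visible = rng.random((4, 5, 3))
    fused = rng.random((4, 5))
    print(recolor_fused(fused, visible).shape)
    print(hdc_receptive_field())
```

Under these assumptions, mixing dilation rates lets a few layers cover a wide neighbourhood without pooling, and replacing only the luminance channel leaves the hue of the visible reference untouched.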
