LAE-GAN: a novel cloud-based Low-light Attention Enhancement Generative Adversarial Network for unpaired text images

Minglong Xue1,2, Yanyi He2, Pengju Xie2, Zuyong He2, Xin Feng2
1National Key Lab for Novel Software Technology, Nanjing University, Nanjing, China
2School of Computer Science and Engineering, Chongqing University of Technology, Chongqing, China

Tóm tắt

AbstractWith the widespread adoption of mobile multimedia devices, the deployment of compute-intensive inference tasks on edge and resource-constrained devices, particularly in the context of low-light text detection, remains a formidable challenge. Existing deep learning approaches have shown limited effectiveness in restoring images for extremely dark scenes. To address these limitations, this paper presents a novel cloud-based Low-light Attention Enhancement Generative Adversarial Network for unpaired text images (LAE-GAN) for the non-paired text image enhancement task in extremely low-light conditions. In the first stage, compressed low-light images are transmitted from edge devices to a cloud server for image enhancement. The LAE-GAN, an end-to-end network comprising a Zero-DCE and AGM-net generator, is designed with a global and local discriminator structure. The initial illumination restoration of extremely low-light images is accomplished using the Zero-DCE network. To enhance text details, we propose an Enhanced Text Attention Mechanism (ETAM) that transforms text information into a comprehensive text attention mechanism across the entire network. The Sobel operator is employed to extract text edge information, while attention is focused on text region details through constraints imposed on the attention map and edge map. Additionally, an AGM-Net module is integrated to reduce noise and fine-tune illumination. In the second stage, the cloud server makes decisions based on user requirements and processes requests in parallel, scaling with the quantity of requests. In the third stage, the enhanced results are transmitted back to edge devices for text detection. Experimental results on widely used LOL and SID low-light datasets demonstrate significant improvements in both quantitative and qualitative analysis, surpassing state-of-the-art enhancement methods in terms of image restoration and text detection.

Từ khóa


Tài liệu tham khảo

Sandhu AK (2021) Big data with cloud computing: Discussions and challenges. Big Data Min Analytics 5(1):32–40

Mousavi SN, Chen F, Abbasi M, Khosravi MR, Rafiee M (2022) Efficient pipelined flow classification for intelligent data processing in iot. Digit Commun Netw 8(4):561–575

Song W, Wu Y, Cui Y, Liu Q, Shen Y, Qiu Z, Yao J, Peng Z (2022) Public integrity verification for data sharing in cloud with asynchronous revocation. Digit Commun Netw 8(1):33–43

Liu Y, Wu H, Rezaee K, Khosravi MR, Khalaf OI, Khan AA, Ramesh D, Qi L (2022) Interaction-enhanced and time-aware graph convolutional network for successive point-of-interest recommendation in traveling enterprises. IEEE Trans Ind Inform 19(1):635–643

Qi L, Liu Y, Zhang Y, Xu X, Bilal M, Song H (2022) Privacy-aware point-of-interest category recommendation in internet of things. IEEE Internet Things J 9(21):21398–21408

Liu Y, Zhou X, Kou H et al (2023) Privacy-Preserving Point-of-Interest Recommendation based on Simplified Graph Convolutional Network for Geological Traveling[J]. ACM Transactions on Intelligent Systems and Technology

Xue M, Huang Z, Liu RZ, Lu T (2021) A Novel Attention Enhanced Residual-In-Residual Dense Network for Text Image Super-Resolution. 2021 IEEE International Conference on Multimedia and Expo (ICME), Shenzhen, p. 1–6. https://doi.org/10.1109/ICME51207.2021.9428128

Xue M, Shivakumara P, Zhang C, Lu T, Pal U (2019) Curved text detection in blurred/non-blurred video/scene images. Multimed Tools Appl 78:25629–25653

Chen Y, Zhao F, Lu Y, Chen X (2022) Dynamic task offloading for mobile edge computing with hybrid energy supply. Tsinghua Sci Technol 28(3):421–432

Chen Y, Xing H, Ma Z, Chen X, Huang J (2022) Cost-efficient edge caching for noma-enabled iot services. China Commun

Zhu E, Zhang J, Yan J, Chen K, Gao C (2022) N-gram malgan: Evading machine learning detection via feature n-gram. Digit Commun Netw 8(4):485–491

Zhang S, Yao L, Sun A, Tay Y (2019) Deep learning based recommender system: A survey and new perspectives. ACM Comput Surv (CSUR) 52(1):1–38

Kim J, Lee JK, Lee KM (2016) Accurate Image Super-Resolution Using Very Deep Convolutional Networks. Proceedings of the Ieee Computer Society Conference on Computer Vision and Pattern Recognition V2016-december. p 1646–1654. https://doi.org/10.1109/CVPR.2016.182

Zhang K, Zuo W, Chen Y, Meng D, Zhang L (2017) Beyond a gaussian denoiser: Residual learning of deep cnn for image denoising. IEEE Trans Image Process 26(7):3142–3155

Tao X, Gao H, Shen X, Wang J, Jia J (2018) Scale-recurrent network for deep image deblurring. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp 8174–8182

Hsu PH, Lin CT, Ng CC, Kew JL, Tan MY, Lai SH, Chan CS, Zach C (2022) Extremely low-light image enhancement with scene text restoration. In: 2022 26th International Conference on Pattern Recognition (ICPR). IEEE, pp 317–323

Wang W, Xie E, Song X, Zang Y, Wang W, Lu T, Yu G, Shen C (2019) Efficient and accurate arbitrary-shaped text detection with pixel aggregation network. In: Proceedings of the IEEE/CVF international conference on computer vision. pp 8440–8449

Baek J, Kim G, Lee J, Park S, Han D, Yun S, Oh SJ, Lee H (2019) What is wrong with scene text recognition model comparisons? dataset and model analysis. In: Proceedings of the IEEE/CVF international conference on computer vision. pp 4715–4723

Zhang SX, Zhu X, Yang C, Wang H, Yin XC (2021) Adaptive boundary proposal network for arbitrary shape text detection. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. pp 1305–1314

Liao M, Wan Z, Yao C, Chen K, Bai X (2020) Real-time scene text detection with differentiable binarization. In: Proceedings of the AAAI conference on artificial intelligence, vol 34. pp 11474–11481

Wei C, Wang W, Yang W, Liu J (2018) Deep retinex decomposition for low-light enhancement. arXiv preprint arXiv:1808.04560

Gharbi M, Chen J, Barron JT, Hasinoff SW, Durand F (2017) Deep bilateral learning for real-time image enhancement. ACM Trans Graph (TOG) 36(4):1–12

Chen C, Chen Q, Xu J, et al (2018) Learning to see in the dark[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 3291-3300.

Jiang Y, Gong X, Liu D, Cheng Y, Fang C, Shen X, Yang J, Zhou P, Wang Z (2021) Enlightengan: Deep light enhancement without paired supervision. IEEE Trans Image Process 30:2340–2349

Guo C, Li C, Guo J, Loy CC, Hou J, Kwong S, Cong R (2020) Zero-reference deep curve estimation for low-light image enhancement. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp 1780–1789

Cui Z, Li K, Gu L, Su S, Gao P, Jiang Z, Qiao Y, Harada T (2022) Illumination adaptive transformer. arXiv preprint arXiv:2205.14871

Bi R, Liu Q, Ren J, Tan G (2020) Utility aware offloading for mobile-edge computing. Tsinghua Sci Technol 26(2):239–250

Huang J, Lv B, Wu Y, Chen Y, Shen X (2021) Dynamic admission control and resource allocation for mobile edge computing enabled small cell network. IEEE Trans Veh Technol 71(2):1964–1973

Qi L, Lin W, Zhang X, Dou W, Xu X, Chen J (2022) A correlation graph based approach for personalized and compatible web apis recommendation in mobile app development. IEEE Trans Knowl Data Eng

Xu Z, Zhu D, Chen J, Yu B (2022) Splitting and placement of data-intensive applications with machine learning for power system in cloud computing. Digit Commun Netw 8(4):476–484

Mehta R, Sivaswamy J (2017) M-net: A convolutional neural network for deep brain structure segmentation. In: 2017 IEEE 14th international symposium on biomedical imaging (ISBI 2017). IEEE, pp 437–440

Land EH, McCann JJ (1971) Lightness and retinex theory. Josa 61(1):1–11

Li C, Guo C, Han L, Jiang J, Cheng MM, Gu J, Loy CC (2021) Low-light image and video enhancement using deep learning: A survey. IEEE Trans Pattern Anal Mach Intell 44(12):9396–9416

Lore KG, Akintayo A, Sarkar S (2017) Llnet: A deep autoencoder approach to natural low-light image enhancement. Pattern Recogn 61:650–662

Laghari AA, Jumani AK, Laghari RA (2021) Review and state of art of fog computing. Arch Comput Methods Eng 28(5):3631–3643

Satyanarayanan M (2017) The emergence of edge computing. Computer 50(1):30–39

Sun X, Ansari N (2016) Edgeiot: Mobile edge computing for the internet of things. IEEE Commun Mag 54(12):22–29

Wang R, Tsai WT, He J, Liu C, Li Q, Deng E (2019) A video surveillance system based on permissioned blockchains and edge computing. In: 2019 IEEE international conference on big data and smart computing (BigComp). IEEE, pp 1–6

Chen J, Li K, Deng Q, Li K, Philip SY (2019) Distributed deep learning model for intelligent video surveillance systems with edge computing. IEEE Trans Ind Inform

Chen C, Liu B, Wan S, Qiao P, Pei Q (2020) An edge traffic flow detection scheme based on deep learning in an intelligent transportation system. IEEE Trans Intell Transp Syst 22(3):1840–1852

Wan S, Ding S, Chen C (2022) Edge computing enabled video segmentation for real-time traffic monitoring in internet of vehicles. Pattern Recogn 121:108146

Chen C, Liu L, Wan S, Hui X, Pei Q (2021) Data dissemination for industry 4.0 applications in internet of vehicles based on short-term traffic prediction. ACM Trans Internet Technol (TOIT) 22(1):1–18

Zhu JY, Park T, Isola P, Efros AA (2017) Unpaired image-to-image translation using cycle-consistent adversarial networks. In: Proceedings of the IEEE international conference on computer vision. pp 2223–2232

Wang T, Sun M, Hu K (2017) Dilated deep residual network for image denoising. In: 2017 IEEE 29th international conference on tools with artificial intelligence (ICTAI). IEEE, pp 1272–1279

Yuan Q, Zhang Q, Li J, Shen H, Zhang L (2018) Hyperspectral image denoising employing a spatial-spectral deep residual convolutional neural network. IEEE Trans Geosci Remote Sens 57(2):1205–1218

Kim JY, Kim LS, Hwang SH (2001) An advanced contrast enhancement using partially overlapped sub-block histogram equalization. IEEE Trans Circ Syst Video Technol 11(4):475–484

Zhang C, Shivakumara P, Xue M, Zhu L, Lu T, Pal U (2018) New fusion based enhancement for text detection in night video footage. In: Advances in Multimedia Information Processing–PCM 2018: 19th Pacific-Rim Conference on Multimedia, Hefei, China, September 21-22, 2018, Proceedings, Part III 19. Springer, pp 46–56

Ronneberger O, Fischer P, Brox T (2015) U-net: Convolutional networks for biomedical image segmentation. In: Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015: 18th International Conference, Munich, Germany, October 5-9, 2015, Proceedings, Part III 18. Springer, pp 234–241

Jolicoeur-Martineau A (2018) The relativistic discriminator: a key element missing from standard gan. arXiv preprint arXiv:1807.00734

Mao X, Li Q, Xie H, Lau RY, Wang Z, Paul Smolley S (2017) Least squares generative adversarial networks. In: Proceedings of the IEEE international conference on computer vision. pp 2794–2802