Disentangled generation network for enlarged license plate recognition and a unified dataset

Computer Vision and Image Understanding - Volume 238 - Page 103880 - 2024
Chenglong Li (1,2,3), Xiaobin Yang (2,4), Guohao Wang (2,4), Aihua Zheng (1,2,3), Chang Tan (5), Jin Tang (1,2,4)
(1) Information Materials and Intelligent Sensing Laboratory of Anhui Province, Hefei 230601, China
(2) Anhui Provincial Key Laboratory of Multimodal Cognitive Computation, Hefei 230601, China
(3) School of Artificial Intelligence, Anhui University, Hefei 230601, China
(4) School of Computer Science and Technology, Anhui University, Hefei 230601, China
(5) iFLYTEK Co., Ltd., Hefei 230088, China
