Multi-feature contrastive learning for unpaired image-to-image translation
Tóm tắt
Unpaired image-to-image translation for the generation field has made much progress recently. However, these methods suffer from mode collapse because of the overfitting of the discriminator. To this end, we propose a straightforward method to construct a contrastive loss using the feature information of the discriminator output layer, which is named multi-feature contrastive learning (MCL). Our proposed method enhances the performance of the discriminator and solves the problem of model collapse by further leveraging contrastive learning. We perform extensive experiments on several open challenge datasets. Our method achieves state-of-the-art results compared with current methods. Finally, a series of ablation studies proved that our approach has better stability. In addition, our proposed method is also practical for single image translation tasks. Code is available at
https://github.com/gouayao/MCL.
Tài liệu tham khảo
Baek K, Choi Y, Uh Y, Yoo J, Shim H (2021) Rethinking the truly unsupervised image-to-image translation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 14154–14163
Benaim S, Wolf (2017) One-sided unsupervised domain mapping. In: NIPS, pp. 752–762 http://papers.nips.cc/paper/6677-one-sided-unsupervised-domain-mapping
Chaitanya B, Mukherjee S (2021) Single image dehazing using improved cyclegan. J Vis Commun Image Represent 74:103014
Chen Q, Koltun V (2017) Photographic image synthesis with cascaded refinement networks. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV), pp. 1511–1520
Chen J, Chen J, Chao H, Yang M (2018) Image blind denoising with generative adversarial network based noise modeling. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3155–3164
Chen T, Kornblith S, Norouzi M, Hinton GE (2020) A simple framework for contrastive learning of visual representations. CoRR. arXiv:2002.05709
Choi Y, Uh Y, Yoo J, Ha JW Stargan (2020) v2: Diverse image synthesis for multiple domains. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 8188–8197
Cordts M, Omran M, Ramos S, Rehfeld T, Enzweiler M, Benenson R, Franke U, Roth S, Schiele B (2016) The cityscapes dataset for semantic urban scene understanding. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3213–3223
Dash A, Ye J, Wang G (2021) A review of generative adversarial networks (gans) and its applications in a wide variety of disciplines—from medical to remote sensing
Deng J, Dong W, Socher R, Li LJ, Li K, Fei-Fei L (2009) Imagenet: A large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255 https://doi.org/10.1109/CVPR.2009.5206848
Fu H, Gong M, Wang C, Batmanghelich K, Zhang K, Tao D (2019) Geometry-consistent generative adversarial networks for one-sided unsupervised domain mapping. In: CVPR, pp. 2427–2436
Gatys LA, Ecker AS, Bethge M (2016) Image style transfer using convolutional neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2414–2423
GM, H., Gourisaria, M.K., Pandey, M., Rautaray, S.S. (2020) A comprehensive survey and analysis of generative models in machine learning. Comput Sci Rev 38:100285
Goodfellow IJ, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville AC, Bengio Y (2014) Generative adversarial nets. In: NIPS, pp. 2672–2680 . http://papers.nips.cc/paper/5423-generative-adversarial-nets
Gutmann M, Hyvärinen A (2010) Noise-contrastive estimation: A new estimation principle for unnormalized statistical models. In: Y.W. Teh, M. Titterington (eds.) Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, Proceedings of Machine Learning Research, vol. 9, pp. 297–304. PMLR, Chia Laguna Resort, Sardinia, Italy https://proceedings.mlr.press/v9/gutmann10a.html
Han J, Shoeiby M, Petersson L, Armin MA (2021) Dual contrastive learning for unsupervised image-to-image translation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, pp. 746–755
He K, Fan H, Wu Y, Xie S, Girshick R (2020) Momentum contrast for unsupervised visual representation learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 9729–9738
Heusel M, Ramsauer H, Unterthiner T, Nessler B, Hochreiter S (2017) Gans trained by a two time-scale update rule converge to a local nash equilibrium. In: NIPS, pp. 6629–6640 http://papers.nips.cc/paper/7240-gans-trained-by-a-two-time-scale-update-rule-converge-to-a-local-nash-equilibrium
Huang X, Liu MY, Belongie S, Kautz J (2018) Multimodal unsupervised image-to-image translation. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 172–189
Isola P, Zhu JY, Zhou T, Efros AA (2017) Image-to-image translation with conditional adversarial networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1125–1134
Jeong J, Shin J (2021) Training gans with stronger augmentations via contrastive discriminator. In: International Conference on Learning Representations . https://openreview.net/forum?id=eo6U4CAwVmg
Johnson J, Alahi A, Fei-Fei L (2016) Perceptual losses for real-time style transfer and super-resolution. In: Leibe B, Matas J, Sebe N, Welling M (eds) Computer Vision - ECCV 2016. Springer International Publishing, Cham, pp 694–711
Karras T, Laine S, Aittala M, Hellsten J, Lehtinen J, Aila T (2020) Analyzing and improving the image quality of stylegan. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 8110–8119
Kim T, Cha M, Kim H, Lee JK, Kim J (2017) Learning to discover cross-domain relations with generative adversarial networks. In: D. Precup, Y.W. Teh (eds.) Proceedings of the 34th International Conference on Machine Learning, Proceedings of Machine Learning Research, vol. 70, pp. 1857–1865. PMLR https://proceedings.mlr.press/v70/kim17a.html
Kolkin N, Salavon J, Shakhnarovich G (2019) Style transfer by relaxed optimal transport and self-similarity. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 10051–10060
Ledig C, Theis L, Huszar F, Caballero J, Cunningham A, Acosta A, Aitken A, Tejani A, Totz J, Wang Z, Shi W (2017) Photo-realistic single image super-resolution using a generative adversarial network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 4681–4690
Lee HY, Tseng HY, Huang JB, Singh M, Yang, MH (2018) Diverse image-to-image translation via disentangled representations. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 35–51
Li R, Pan J, Li Z, Tang J (2018) Single image dehazing via conditional generative adversarial network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 8202–8211
Li T, Qian R, Dong C, Liu S, Yan Q, Zhu W, Lin L (2018) Beautygan: Instance-level facial makeup transfer with deep generative adversarial network. In: Proceedings of the 26th ACM International Conference on Multimedia, MM ’18, p. 645-653. Association for Computing Machinery, New York, NY, USA https://doi.org/10.1145/3240508.3240618
Liu MY, Breuel T, Kautz J (2017) Unsupervised image-to-image translation networks. In: Guyon I, Luxburg UV, . Bengio, H. Wallach, R. Fergus, S. Vishwanathan, R. Garnett (eds.) Advances in Neural Information Processing Systems, vol. 30. Curran Associates, Inc. https://proceedings.neurips.cc/paper/2017/file/dc6a6489640ca02b0d42dabeb8e46bb7-Paper.pdf
Liu R, Ge Y, Choi CL, Wang X, Li H (2021). Divco: Diverse conditional image synthesis via contrastive generative adversarial network. CoRR. arXiv:2103.07893
Long J, Shelhamer E, Darrell T (2015) Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3431–3440
Mao X, Li Q, Xie H, Lau RY, Wang Z, Smolley SP (2017) Least squares generative adversarial networks. In: 2017 IEEE International Conference on Computer Vision (ICCV), pp. 2813–2821. https://doi.org/10.1109/ICCV.2017.304
Monday HN, Li J, Nneji GU, Nahar S, Hossin MA, Jackson J, Oluwasanmi A (2022) A wavelet convolutional capsule network with modified super resolution generative adversarial network for fault diagnosis and classification. Complex Intell Syst: 1–17
Park T, Liu MY, Wang TC, Zhu JY (2019) Semantic image synthesis with spatially-adaptive normalization. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2337–2346
Park T, Efros AA, Zhang R, Zhu JY (2020) Contrastive learning for unpaired image-to-image translation. In: Vedaldi A, Bischof H, Brox T, Frahm JM (eds) Computer Vision - ECCV 2020. Springer International Publishing, Cham, pp 319–345
Salehi P, Chalechale A, Taghizadeh M (2020) Generative adversarial networks (gans): An overview of theoretical model, evaluation metrics, and recent developments. CoRR. arXiv:2005.13178
van den Oord A, Li Y, Vinyals O (2018) Representation learning with contrastive predictive coding. CoRR. arXiv:1807.03748
Wang TC, Liu MY, Zhu JY, Tao A, Kautz J, Catanzaro B (2018) High-resolution image synthesis and semantic manipulation with conditional gans. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 8798–8807
Wang X, Yu K, Wu S, Gu J, Liu Y, Dong C, Qiao Y, Change Loy C (2018) Esrgan: Enhanced super-resolution generative adversarial networks. In: Proceedings of the European Conference on Computer Vision (ECCV) Workshops
Wang C, Zheng H, Yu Z, Zheng Z, Gu Z, Zheng B (2018) Discriminative region proposal adversarial networks for high-quality image-to-image translation. In: Proceedings of the European Conference on Computer Vision (ECCV)
Wu Z, Xiong Y, Yu SX, Lin D (2018) Unsupervised feature learning via non-parametric instance discrimination. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3733–3742
Yi Z, Zhang H, Tan P, Gong M (2017) Dualgan: Unsupervised dual learning for image-to-image translation. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV), pp. 2849–2857
Yoo J, Uh Y, Chun S, Kang B, Ha JW (2019) Photorealistic style transfer via wavelet transforms. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 9036–9045
Yu F, Koltun V, Funkhouser T (2017) Dilated residual networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 472–480
Zhang R, Isola P, Efros AA (2016) Colorful image colorization. In: Leibe B, Matas J, Sebe N, Welling M (eds) Computer Vision - ECCV 2016. Springer International Publishing, Cham, pp 649–666
Zhu JY, Park T, Isola P, Efros AA (2017) Unpaired image-to-image translation using cycle-consistent adversarial networks. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV), pp. 2223–2232