Optimal transport-based unsupervised semantic disentanglement: A novel approach for efficient image editing in GANs

Displays - Volume 80 - Article 102560 - 2023
Yunqi Liu¹, Xue Ouyang², Tian Jiang¹, Hongwei Ding¹, Xiaohui Cui¹
¹ Key Laboratory of Aerospace Information Security and Trusted Computing, Ministry of Education, School of Cyber Science and Engineering, Wuhan University, Wuhan, China
² State Key Laboratory of Information Engineering in Surveying, Wuhan University, Wuhan, China
