Detecting facial manipulated images via one-class domain generalization

Springer Science and Business Media LLC - Volume 30 - Pages 1-14 - 2024
Pengxiang Xu1,2, Zhiyuan Ma1, Xue Mei1, Jie Shen1
1College of Electrical Engineering and Control Science, Nanjing Tech University, Nanjing, China
2School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China

Abstract

Synthesized images and videos produced by facial manipulation techniques have become an emerging problem, making facial manipulation detection a significant research topic. Concern about the use of synthesized facial content in society is rising due to its deceptive nature and widespread distribution. Many detection methods have been proposed, but most focus on specific datasets, and it is hard for them to detect facial images or videos manipulated by unknown face synthesis algorithms. In this paper, we propose a method that improves the generalization ability of a facial manipulation detection model using one-class domain generalization. We cast the problem as domain generalization and divide the dataset into several domains according to the manipulation algorithm used. We also process the images from the perspective of the frequency domain, applying a two-dimensional wavelet transform as preprocessing so that the method remains effective on compressed images. Experiments on the FaceForensics++ dataset exceed the baselines and recent works, and feature visualization analyses intuitively show that our method learns robust feature representations that generalize to unseen domains.
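The abstract's two-dimensional wavelet preprocessing can be illustrated with a minimal sketch. This is not the authors' code; it implements a single-level 2D Haar transform in plain NumPy (the paper does not specify the wavelet family) to show the kind of subband decomposition such preprocessing produces:

```python
import numpy as np

def haar_dwt2(img: np.ndarray):
    """Single-level 2D Haar wavelet transform.

    img must have even height and width. Returns the four standard
    subbands: LL (approximation), LH, HL, HH (details).
    """
    # Pairwise averages and differences along rows...
    lo = (img[:, 0::2] + img[:, 1::2]) / 2.0
    hi = (img[:, 0::2] - img[:, 1::2]) / 2.0
    # ...then along columns, yielding the four subbands.
    ll = (lo[0::2, :] + lo[1::2, :]) / 2.0  # approximation
    lh = (lo[0::2, :] - lo[1::2, :]) / 2.0  # horizontal detail
    hl = (hi[0::2, :] + hi[1::2, :]) / 2.0  # vertical detail
    hh = (hi[0::2, :] - hi[1::2, :]) / 2.0  # diagonal detail
    return ll, lh, hl, hh

# A constant (perfectly smooth) image puts all energy in LL; manipulation
# artifacts tend to surface in the high-frequency subbands (LH, HL, HH).
ll, lh, hl, hh = haar_dwt2(np.ones((4, 4)))
```

In practice the subbands (or a subset of them) would be stacked as input channels for the detection network; libraries such as PyWavelets provide multi-level transforms with other wavelet families.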

References

Akhtar, N., Mian, A.: Threat of adversarial attacks on deep learning in computer vision: a survey. IEEE Access 6, 14410–14430 (2018). https://doi.org/10.1109/ACCESS.2018.2807385
Aneja, S., Nießner, M.: Generalized zero and few-shot transfer for facial forgery detection (2020). arXiv preprint arXiv:2006.11863. Accessed 30 Nov 2021
Chollet, F.: Xception: deep learning with depthwise separable convolutions. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1800–1807 (2017). https://doi.org/10.1109/CVPR.2017.195
Chu, X., Jin, Y., Zhu, W., Wang, Y., Wang, X., Zhang, S., Mei, H.: DNA: domain generalization with diversified neural averaging. In: Proceedings of the 39th International Conference on Machine Learning, vol. 162, pp. 4010–4034 (2022)
Dang, H., Liu, F., Stehouwer, J., Liu, X., Jain, A.K.: On the detection of digital face manipulation. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5780–5789 (2020). https://doi.org/10.1109/CVPR42600.2020.00582
Dolhansky, B., Howes, R., Pflaum, B., Baram, N., Ferrer, C.C.: The deepfake detection challenge (DFDC) preview dataset (2019). arXiv preprint arXiv:1910.08854. Accessed 30 Nov 2021
Durall, R., Keuper, M., Keuper, J.: Watch your up-convolution: CNN based generative deep neural networks are failing to reproduce spectral distributions. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 7887–7896 (2020). https://doi.org/10.1109/CVPR42600.2020.00791
Frank, J., Eisenhofer, T., Schönherr, L., Fischer, A., Kolossa, D., Holz, T.: Leveraging frequency analysis for deep fake image recognition. In: Proceedings of the 37th International Conference on Machine Learning, vol. 119, pp. 3247–3258 (2020)
Ganin, Y., Lempitsky, V.: Unsupervised domain adaptation by backpropagation. In: Proceedings of the 32nd International Conference on Machine Learning, vol. 37, pp. 1180–1189 (2015)
Ganin, Y., Ustinova, E., Ajakan, H., Germain, P., Larochelle, H., Laviolette, F., Marchand, M., Lempitsky, V.: Domain-adversarial training of neural networks. J. Mach. Learn. Res. 17(1), 2096–2130 (2016)
Geirhos, R., Rubisch, P., Michaelis, C., Bethge, M., Wichmann, F.A., Brendel, W.: ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness (2018). arXiv preprint arXiv:1811.12231. Accessed 15 Sept 2022
Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio, Y.: Generative adversarial nets. Advances in Neural Information Processing Systems 27, 1–9 (2014)
Huang, Y., Juefei-Xu, F., Wang, R., Xie, X., Ma, L., Li, J., Miao, W., Liu, Y., Pu, G.: FakeLocator: robust localization of GAN-based face manipulations via semantic segmentation networks with bells and whistles (2020). arXiv preprint arXiv:2001.09598. Accessed 15 Sept 2022
Jung, T., Kim, S., Kim, K.: DeepVision: deepfakes detection using human eye blinking pattern. IEEE Access 8, 83144–83154 (2020). https://doi.org/10.1109/ACCESS.2020.2988660
Karras, T., Aila, T., Laine, S., Lehtinen, J.: Progressive growing of GANs for improved quality, stability, and variation (2017). arXiv preprint arXiv:1710.10196. Accessed 20 Oct 2022
Karras, T., Laine, S., Aila, T.: A style-based generator architecture for generative adversarial networks. IEEE Trans. Pattern Anal. Mach. Intell. 43(12), 4217–4228 (2021). https://doi.org/10.1109/TPAMI.2020.2970919
Karras, T., Laine, S., Aittala, M., Hellsten, J., Lehtinen, J., Aila, T.: Analyzing and improving the image quality of StyleGAN. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 8107–8116 (2020). https://doi.org/10.1109/CVPR42600.2020.00813
Kim, D.-K., Kim, D., Kim, K.: Facial manipulation detection based on the color distribution analysis in edge region (2021). arXiv preprint arXiv:2102.01381. Accessed 20 Oct 2022
King, D.E.: Dlib-ml: a machine learning toolkit. J. Mach. Learn. Res. 10, 1755–1758 (2009)
Kowalski, M.: FaceSwap (2018). https://github.com/marekkowalski/faceswap
Li, Y., Yang, X., Sun, P., Qi, H., Lyu, S.: Celeb-DF: a large-scale challenging dataset for deepfake forensics. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3204–3213 (2020). https://doi.org/10.1109/CVPR42600.2020.00327
Liu, D., Dang, Z., Peng, C., Zheng, Y., Li, S., Wang, N., Gao, X.: FedForgery: generalized face forgery detection with residual federated learning (2022). arXiv preprint arXiv:2210.09563. Accessed 20 Oct 2022
Liu, Z., Qi, X., Torr, P.H.S.: Global texture enhancement for fake face detection in the wild. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 8057–8066 (2020). https://doi.org/10.1109/CVPR42600.2020.00808
Matern, F., Riess, C., Stamminger, M.: Exploiting visual artifacts to expose deepfakes and face manipulations. In: 2019 IEEE Winter Applications of Computer Vision Workshops (WACVW), pp. 83–92 (2019). https://doi.org/10.1109/WACVW.2019.00020
Pu, Y., Gan, Z., Henao, R., Yuan, X., Li, C., Stevens, A., Carin, L.: Variational autoencoder for deep learning of images, labels and captions. Advances in Neural Information Processing Systems 29, 1–9 (2016)
Rangwani, H., Aithal, S.K., Mishra, M., Jain, A., Radhakrishnan, V.B.: A closer look at smoothness in domain adversarial training. In: Proceedings of the 39th International Conference on Machine Learning, vol. 162, pp. 18378–18399 (2022)
Rössler, A., Cozzolino, D., Verdoliva, L., Riess, C., Thies, J., Niessner, M.: FaceForensics++: learning to detect manipulated facial images. In: 2019 IEEE/CVF International Conference on Computer Vision (ICCV), pp. 1–11 (2019). https://doi.org/10.1109/ICCV.2019.00009
Sun, K., Liu, H., Ye, Q., Gao, Y., Liu, J., Shao, L., Ji, R.: Domain general face forgery detection by learning to weight. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, pp. 2638–2646 (2021). https://doi.org/10.1609/aaai.v35i3.16367
Suwajanakorn, S., Seitz, S.M., Kemelmacher-Shlizerman, I.: Synthesizing Obama: learning lip sync from audio. ACM Trans. Graph. (2017). https://doi.org/10.1145/3072959.3073640
Thies, J., Zollhöfer, M., Nießner, M.: Deferred neural rendering: image synthesis using neural textures. ACM Trans. Graph. (2019). https://doi.org/10.1145/3306346.3323035
Thies, J., Zollhöfer, M., Stamminger, M., Theobalt, C., Nießner, M.: Face2Face: real-time face capture and reenactment of RGB videos. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2387–2395 (2016). https://doi.org/10.1109/CVPR.2016.262
Tolosana, R., Vera-Rodriguez, R., Fierrez, J., Morales, A., Ortega-Garcia, J.: Deepfakes and beyond: a survey of face manipulation and fake detection. Inf. Fusion 64, 131–148 (2020). https://doi.org/10.1016/j.inffus.2020.06.014
Tora, M.: Deepfakes (2018). https://github.com/deepfakes/faceswap/tree/v2.0.0
Tran, V.-N., Kwon, S.-G., Lee, S.-H., Le, H.-S., Kwon, K.-R.: Generalization of forgery detection with meta deepfake detection model. IEEE Access 11, 535–546 (2023). https://doi.org/10.1109/ACCESS.2022.3232290
Van Der Maaten, L.: Barnes-Hut-SNE (2013). arXiv preprint arXiv:1301.3342. Accessed 15 Apr 2021
Yu, P., Fei, J., Xia, Z., Zhou, Z., Weng, J.: Improving generalization by commonality learning in face forgery detection. IEEE Trans. Inf. Forensics Secur. 17, 547–558 (2022). https://doi.org/10.1109/TIFS.2022.3146781
Zhang, X., Karaman, S., Chang, S.-F.: Detecting and simulating artifacts in GAN fake images. In: 2019 IEEE International Workshop on Information Forensics and Security (WIFS), pp. 1–6 (2019). https://doi.org/10.1109/WIFS47025.2019.9035107
Zhang, X., Wang, S., Liu, C., Zhang, M., Liu, X., Xie, H.: Thinking in patch: towards generalizable forgery detection with patch transformation. In: Pham, D.N., Theeramunkong, T., Governatori, G., Liu, F. (eds.) PRICAI 2021: Trends in Artificial Intelligence, pp. 337–352 (2021). https://doi.org/10.1007/978-3-030-89370-5_25
Zhao, H., Wei, T., Zhou, W., Zhang, W., Chen, D., Yu, N.: Multi-attentional deepfake detection. In: 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2185–2194 (2021). https://doi.org/10.1109/CVPR46437.2021.00222
Zhou, P., Han, X., Morariu, V.I., Davis, L.S.: Two-stream neural networks for tampered face detection. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 1831–1839 (2017). https://doi.org/10.1109/CVPRW.2017.229
Zhu, J.-Y., Park, T., Isola, P., Efros, A.A.: Unpaired image-to-image translation using cycle-consistent adversarial networks. In: 2017 IEEE International Conference on Computer Vision (ICCV), pp. 2242–2251 (2017). https://doi.org/10.1109/ICCV.2017.244
Zhuang, W., Chu, Q., Tan, Z., Liu, Q., Yuan, H., Miao, C., Luo, Z., Yu, N.: UIA-ViT: unsupervised inconsistency-aware method based on vision transformer for face forgery detection. In: Avidan, S., Brostow, G., Cissé, M., Maria Farinella, G., Hassner, T. (eds.) Computer Vision – ECCV 2022, pp. 391–407 (2022). https://doi.org/10.1007/978-3-031-20065-6_23
Zhuang, W., Chu, Q., Yuan, H., Miao, C., Liu, B., Yu, N.: Towards intrinsic common discriminative features learning for face forgery detection using adversarial learning. In: 2022 IEEE International Conference on Multimedia and Expo (ICME), pp. 1–6 (2022). https://doi.org/10.1109/ICME52920.2022.9859586