Backdoor Pony: Evaluating backdoor attacks and defenses in different domains

SoftwareX - Volume 22 - Article 101387 - 2023
Arthur Mercier1,2, Nikita Smolin2, Oliver Sihlovec2, Stefanos Koffas2, Stjepan Picek3,2
1Van Mourik Broekmanweg 6, 2628 XE Delft, The Netherlands
2Cybersecurity Group, Delft University of Technology, The Netherlands
3Digital Security Group, Radboud University, The Netherlands

References

[1] Phillips, 2014, Comparison of human and computer performance across face recognition experiments, Image Vis Comput, 32, 74, 10.1016/j.imavis.2013.12.002
[2] Chen, 2018, Stock prediction using convolutional neural network, IOP Conf Ser: Mater Sci Eng, 435
[3] Yang, 2018
[4] Bojarski, 2017
[5] Gu, 2017
[6] Li, 2022
[7] Wu, 2022, BackdoorBench: A comprehensive benchmark of backdoor learning
[8] Pang R, Zhang Z, Gao X, Xi Z, Ji S, Cheng P, et al. TrojanZoo: Towards unified, holistic, and practical evaluation of neural backdoors. In: Proceedings of the IEEE European symposium on security and privacy. 2022.
[9] Nicolae, 2018
[10] Cui G, Yuan L, He B, Chen Y, Liu Z, Sun M. A unified evaluation of textual backdoor learning: Frameworks and benchmarks. In: Proceedings of NeurIPS: datasets and benchmarks. 2022.
[11] Tinghao, 2022
[12] Turner A, Tsipras D, Madry A. Clean-label backdoor attacks. MIT, URL.
[13] Chen, 2017
[14] Chen X, Salem A, Backes M, Ma S, Zhang Y. BadNL: Backdoor attacks against NLP models. In: ICML 2021 workshop on adversarial machine learning. 2021.
[15] Qi, 2020
[16] Gao, 2021, Design and evaluation of a multi-domain Trojan detection method on deep neural networks, IEEE Trans Dependable Secure Comput, 19, 2349, 10.1109/TDSC.2021.3055844
[17] Liu, 2020, Reflection backdoor: A natural backdoor attack on deep neural networks, 182
[18] Zhao S, Ma X, Zheng X, Bailey J, Chen J, Jiang Y-G. Clean-label backdoor attacks on video recognition models. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2020, p. 14443–52.
[19] Souri, 2021
[20] Li Y, Li Y, Wu B, Li L, He R, Lyu S. Invisible backdoor attack with sample-specific triggers. In: Proceedings of the IEEE/CVF international conference on computer vision. 2021, p. 16463–72.
[21] Nguyen, 2021
[22] Bagdasaryan E, Shmatikov V. Blind backdoors in deep learning models. In: 30th USENIX security symposium. 2021, p. 1505–21.
[23] Nguyen, 2020, Input-aware dynamic backdoor attack, Adv Neural Inf Process Syst, 33, 3454
[24] Li, 2021
[25] Doan K, Lao Y, Zhao W, Li P. LIRA: Learnable, imperceptible and robust backdoor attacks. In: Proceedings of the IEEE/CVF international conference on computer vision. 2021, p. 11966–76.
[26] Liu, 2017, Neural Trojans, 45
[27] Liu, 2018, Fine-pruning: Defending against backdooring attacks on deep neural networks, 273
[28] Zhao, 2020
[29] Li, 2021
[30] Li, 2021, Anti-backdoor learning: Training clean models on poisoned data, Adv Neural Inf Process Syst, 34, 14900
[31] Tang R, Du M, Liu N, Yang F, Hu X. An embarrassingly simple approach for Trojan attack in deep neural networks. In: Proceedings of the 26th ACM SIGKDD international conference on knowledge discovery & data mining. 2020, p. 218–28.
[32] Liu, 2018, Trojaning attack on neural networks
[33] Yao Y, Li H, Zheng H, Zhao BY. Latent backdoor attacks on deep neural networks. In: Proceedings of the 2019 ACM SIGSAC conference on computer and communications security. 2019, p. 2041–55.
[34] Shokri, 2020, Bypassing backdoor detection algorithms in deep learning, 175
[35] Pang R, Shen H, Zhang X, Ji S, Vorobeychik Y, Luo X, et al. A tale of evil twins: Adversarial inputs versus poisoned models. In: Proceedings of the 2020 ACM SIGSAC conference on computer and communications security. 2020, p. 85–99.
[36] Cohen, 2019, Certified adversarial robustness via randomized smoothing, 1310
[37] Xu, 2017
[38] Madry, 2017
[39] Meng D, Chen H. MagNet: A two-pronged defense against adversarial examples. In: Proceedings of the 2017 ACM SIGSAC conference on computer and communications security. 2017, p. 135–47.
[40] Chen, 2018
[41] Tran, 2018, Spectral signatures in backdoor attacks, Adv Neural Inf Process Syst, 31
[42] Gao, 2019
[43] Udeshi, 2022, Model agnostic defence against backdoor attacks in machine learning, IEEE Trans Reliab, 71, 880, 10.1109/TR.2022.3159784
[44] Wang, 2019, Neural Cleanse: Identifying and mitigating backdoor attacks in neural networks, 707
[45] Chen, 2019, DeepInspect: A black-box Trojan detection and mitigation framework for deep neural networks, 8
[46] Guo, 2019
[47] Huang, 2019
[48] Dai, 2019, A backdoor attack against LSTM-based text classification systems, IEEE Access, 7, 138872, 10.1109/ACCESS.2019.2941376
[49] Qi, 2021
[50] Qi, 2021
[51] Shen, 2021
[52] Zhang, 2021, Trojaning language models for fun and profit, 179
[53] Yang W, Lin Y, Li P, Zhou J, Sun X. Rethinking stealthiness of backdoor attack against NLP models. In: Proceedings of the 59th annual meeting of the association for computational linguistics and the 11th international joint conference on natural language processing (volume 1: long papers). 2021, p. 5543–57.
[54] Li, 2021
[55] Yang, 2021
[56] Zhang, 2023, Red alarm for pre-trained models: Universal vulnerability to neuron-level backdoor attacks, Mach Intell Res, 1
[57] Qi, 2021
[58] Kurita, 2020
[59] Yang, 2021
[60] Chen, 2021, Mitigating backdoor attacks in LSTM-based text classification systems by backdoor keyword identification, Neurocomputing, 452, 253, 10.1016/j.neucom.2021.04.105
[61] Barni, 2019, A new backdoor attack in CNNs by training set corruption without label poisoning, 101
[62] Zeng Y, Park W, Mao ZM, Jia R. Rethinking the backdoor attacks' triggers: A frequency perspective. In: Proceedings of the IEEE/CVF international conference on computer vision. 2021, p. 16473–81.
[63] Wu, 2021, Adversarial neuron pruning purifies backdoored deep models, Adv Neural Inf Process Syst, 34, 16913
[64] Huang, 2022
[65] Tang D, Wang X, Tang H, Zhang K. Demon in the variant: Statistical analysis of DNNs for robust backdoor contamination detection. In: USENIX security symposium. 2021, p. 1541–58.
[66] Qi X, Xie T, Li Y, Mahloujifar S, Mittal P. Revisiting the assumption of latent separability for backdoor defenses. In: The eleventh international conference on learning representations. 2023.
[67] Hayase, 2021, SPECTRE: Defending against backdoor attacks using robust statistics, 4129
[68] Chou, 2020, SentiNet: Detecting localized universal attacks against deep learning systems, 48
[69] Kwon, 2021, Defending deep neural networks against backdoor attack by using de-trigger autoencoder, IEEE Access, 1
[70] Zeng, 2021
[71] Zhai, 2021, Backdoor attack against speaker verification, 2560
[72] Chen, 2021, BadNL: Backdoor attacks against NLP models with semantic-preserving improvements
[73] Zhang, 2020
[74] Xi, 2021, Graph backdoor, 1523
[75] Hong, 2021
[76] Gao, 2020
[77] Qi F, Chen Y, Zhang X, Li M, Liu Z, Sun M. Mind the style of text! Adversarial and backdoor attacks based on text style transfer. In: Proceedings of the 2021 conference on empirical methods in natural language processing. 2021, p. 4569–80.
[78] Koffas, 2022, Can you hear it? Backdoor attacks via ultrasonic triggers, 57
[79] Xu, 2022
[80] Doan BG, Abbasnejad E, Ranasinghe DC. Februus: Input purification defense against Trojan attacks on deep neural network systems. In: Annual computer security applications conference. 2020, p. 897–912.