Explainable artificial intelligence for cybersecurity: a literature survey

Annales des Télécommunications, Volume 77, pages 789–812, 2022
Fabien Charmet1, Harry Chandra Tanuwidjaja1, Solayman Ayoubi2, Pierre-François Gimenez3, Yufei Han4, Houda Jmila5, Gregory Blanc5, Takeshi Takahashi1, Zonghua Zhang6
1National Institute of Information and Communications Technology, Tokyo, Japan
2LORIA, Université de Lorraine, Nancy, France
3CentraleSupélec, IRISA, University of Rennes, Rennes, France
4Inria, IRISA, University of Rennes, Rennes, France
5SAMOVAR, Télécom SudParis, Institut Polytechnique de Paris, Palaiseau, France
6Huawei Paris Research Center, Paris, France

Abstract

With the extensive adoption of deep learning (DL) algorithms in recent years, e.g., for detecting Android malware or vulnerable source code, artificial intelligence (AI) and machine learning (ML) have become essential to the development of cybersecurity solutions. However, AI-based cybersecurity solutions share a fundamental limitation with other DL application domains, such as computer vision (CV) and natural language processing (NLP): they cannot justify their results (ranging from detection and prediction to reasoning and decision-making) or make them understandable to humans. Consequently, explainable AI (XAI) has emerged as a prominent research topic addressing the challenge of making AI models explainable or interpretable to human users. It is particularly relevant in the cybersecurity domain, where XAI may allow security operators, who are overwhelmed with tens of thousands of security alerts per day (most of which are false positives), to better assess potential threats and reduce alert fatigue. We conduct an extensive literature review on the intersection between XAI and cybersecurity. In particular, we investigate the existing literature from two perspectives: the applications of XAI to cybersecurity (e.g., intrusion detection, malware classification) and the security of XAI itself (e.g., attacks on XAI pipelines, potential countermeasures). We characterize the security of XAI through several security properties that have been discussed in the literature. We also formulate open questions that are either unanswered or insufficiently addressed in the literature and discuss future directions of research.
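To make the first perspective concrete, the following is a minimal sketch (not taken from the survey) of a post-hoc, model-agnostic explanation applied to a toy intrusion detector: a random forest is trained on synthetic network-flow-like features, and permutation importance is used to attribute its decisions to individual features. The feature names are hypothetical, and permutation importance stands in here for the broader family of post-hoc XAI techniques (e.g., LIME, SHAP, LEMNA) that the survey covers.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic stand-in for network-flow features (names are illustrative only).
feature_names = ["duration", "src_bytes", "dst_bytes",
                 "n_failed_logins", "n_connections", "pkt_rate"]
X, y = make_classification(n_samples=2000, n_features=len(feature_names),
                           n_informative=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# "Black-box" detector: flags each flow as benign (0) or malicious (1).
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

# Model-agnostic explanation: how much does shuffling each feature
# degrade held-out accuracy? Larger drops mean more influential features.
result = permutation_importance(clf, X_test, y_test,
                                n_repeats=10, random_state=0)
for name, score in sorted(zip(feature_names, result.importances_mean),
                          key=lambda t: -t[1]):
    print(f"{name:>16}: {score:.3f}")

An operator triaging an alert could read such a ranking as a first-cut answer to "why was this flow flagged?", although, as the survey stresses, raw feature attributions alone may fall short of a domain-meaningful explanation.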
