XAI-based cross-ensemble feature ranking methodology for machine learning models

International Journal of Information Technology - Volume 15 - Pages 1759-1768 - 2023

Pei Jiang¹, Hiroyuki Suzuki², Takashi Obi³,⁴

¹Course of Information and Communication, Department of Engineer, Tokyo Institute of Technology, Kanagawa, Japan

²Center for Mathematics and Data Science, Gunma University, Maebashi, Japan

³Institute of Innovative Research, Tokyo Institute of Technology, Kanagawa, Japan

⁴Yokohama Midori Ward, Japan

Abstract

Artificial Intelligence (AI) is a robust technology that has been applied in many fields, enabling an innovative society and changing our lifestyles. However, the black-box problem remains a major obstacle for AI. In this study, we first compared the results of kernel SHapley Additive exPlanations (SHAP) across various machine learning models and found that a single SHAP model cannot explain the models at the level of human knowledge. We then calculated a global ranking of the factors using our proposed ensemble methodology. Finally, we compared the new factor ranking with other factor ranking methods. Our experimental results indicate that the proposed cross-ensemble feature ranking methodology provides stable and comparatively reliable feature rankings for both classification and regression models.
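The abstract does not spell out how the per-model SHAP results are combined, so the following is only a minimal sketch of one plausible aggregation, not the authors' method. It assumes that each model already has a matrix of SHAP values (samples by features), takes the mean absolute SHAP value per feature as that model's global importance, normalizes it to a share so models on different output scales are comparable, and averages the shares across models. The function name `cross_model_ranking`, the model names, the feature names, and the numbers are all hypothetical.

```python
import numpy as np


def cross_model_ranking(shap_by_model, feature_names):
    """Combine per-model SHAP values into one ranking (illustrative only)."""
    shares = []
    for values in shap_by_model.values():
        values = np.asarray(values, dtype=float)
        # Global importance of each feature: mean |SHAP| over all samples.
        importance = np.abs(values).mean(axis=0)
        total = importance.sum()
        # Normalize to shares so every model contributes on the same scale.
        shares.append(importance / total if total > 0 else importance)
    # Assumed aggregation rule: plain average of the shares across models.
    mean_share = np.mean(shares, axis=0)
    order = np.argsort(-mean_share, kind="stable")
    return [(feature_names[i], float(mean_share[i])) for i in order]


# Hypothetical SHAP values for two models, two samples, three features.
shap_by_model = {
    "model_a": np.array([[0.5, -0.1, 0.0], [-0.3, 0.1, 0.2]]),
    "model_b": np.array([[2.0, 1.0, -1.0], [-2.0, 3.0, 1.0]]),
}
feature_names = ["age", "bmi", "glucose"]

ranking = cross_model_ranking(shap_by_model, feature_names)
for name, share in ranking:
    print(f"{name}: {share:.3f}")
```

Averaging normalized shares is only one design choice; averaging rank positions instead would make the result less sensitive to a single model that concentrates most of its attribution on one feature.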

