Re-interpreting rules interpretability

Linara Adilova (1,2), Michael Kamp (1,3,4), Gennady Andrienko (2,5), Natalia Andrienko (2,5)

(1) Ruhr University Bochum, Bochum, Germany
(2) Fraunhofer Institute IAIS, Sankt Augustin, Germany
(3) IKIM, University Medicine Essen, Essen, Germany
(4) Monash University, Melbourne, Australia
(5) City, University of London, London, UK

Abstract

Trustworthy machine learning requires a high level of interpretability of machine learning models, yet many models are inherently black boxes. Training interpretable models instead, or using them to mimic the black-box model, seems like a viable solution. In practice, however, these interpretable models are often still unintelligible due to their size and complexity. In this paper, we present an approach to explain the logic of large interpretable models that can be represented as sets of logical rules by a simple, and thus intelligible, descriptive model. The coarseness of this descriptive model and its fidelity to the original model can be controlled, so that a user can understand the original model at varying levels of depth. We showcase and discuss this approach on three real-world problems from healthcare, materials science, and finance.
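To make the idea concrete, the following is a minimal sketch under explicit assumptions, not the paper's actual pipeline: one interval rule is extracted per leaf of a scikit-learn decision tree, same-class rules are clustered by the midpoints of their feature intervals, and each cluster is summarized by its bounding box. The helper names (extract_rules, coarsen, predict) and the clustering choice are illustrative; the number of clusters per class plays the role of the coarseness control, and fidelity is measured as agreement with the original tree on the data.

    # Hypothetical illustration of a coarseness/fidelity trade-off for rule
    # sets; the clustering and summarization choices here are the sketch's
    # own assumptions, not the method proposed in the paper.
    import numpy as np
    from sklearn.cluster import AgglomerativeClustering
    from sklearn.datasets import load_breast_cancer
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_breast_cancer(return_X_y=True)
    tree = DecisionTreeClassifier(max_depth=6, random_state=0).fit(X, y)

    def extract_rules(fitted_tree, n_features):
        """One rule per leaf: per-feature [low, high] intervals plus a class."""
        t = fitted_tree.tree_
        rules = []
        def walk(node, lows, highs):
            if t.children_left[node] == -1:              # reached a leaf
                rules.append((lows.copy(), highs.copy(),
                              int(np.argmax(t.value[node]))))
                return
            f, thr = t.feature[node], t.threshold[node]
            old = highs[f]; highs[f] = min(highs[f], thr)
            walk(t.children_left[node], lows, highs); highs[f] = old
            old = lows[f]; lows[f] = max(lows[f], thr)
            walk(t.children_right[node], lows, highs); lows[f] = old
        walk(0, np.full(n_features, -np.inf), np.full(n_features, np.inf))
        return rules

    def coarsen(rules, n_clusters, lo_bound, hi_bound):
        """Cluster each class's rules by interval midpoints (clipped to the
        data range) and replace every cluster by its bounding box."""
        coarse = []
        for c in sorted(set(r[2] for r in rules)):
            group = [r for r in rules if r[2] == c]
            lows = np.array([np.clip(r[0], lo_bound, hi_bound) for r in group])
            highs = np.array([np.clip(r[1], lo_bound, hi_bound) for r in group])
            k = min(n_clusters, len(group))
            if len(group) < 2:                           # too few rules to cluster
                labels = np.zeros(len(group), dtype=int)
            else:
                labels = AgglomerativeClustering(n_clusters=k).fit_predict(
                    (lows + highs) / 2)
            for lbl in range(k):
                m = labels == lbl
                coarse.append((lows[m].min(0), highs[m].max(0), c))
        return coarse

    def predict(rules, X):
        """First matching box wins; unmatched points get the majority class."""
        default = np.bincount([r[2] for r in rules]).argmax()
        out = np.full(len(X), -1)                        # -1 = not yet matched
        for lo, hi, c in rules:
            inside = np.all((X >= lo) & (X <= hi), axis=1)
            out = np.where((out == -1) & inside, c, out)
        return np.where(out == -1, default, out)

    leaf_rules = extract_rules(tree, X.shape[1])
    for k in (1, 3, 6):
        coarse = coarsen(leaf_rules, k, X.min(0), X.max(0))
        fidelity = (predict(coarse, X) == tree.predict(X)).mean()
        print(f"{len(coarse):3d} coarse rules -> fidelity to the tree {fidelity:.3f}")

A single box per class gives the coarsest, most readable description at the cost of agreement with the tree; raising the cluster count yields more, tighter boxes and hence higher fidelity, which mirrors the controllable depth of understanding described above.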
