iJUICE: integer JUstIfied Counterfactual Explanations

Alejandro Kuratomi1, Ioanna Miliou1, Zed Lee1, Tony Lindgren1, Panagiotis Papapetrou1
1Department of Computer and Systems Sciences, Stockholm University, Borgarfjordsgatan 12, Kista, 16455, Stockholm, Sweden

Abstract

Counterfactual explanations modify the feature values of an instance in order to alter its prediction from an undesired to a desired label. As such, they are highly useful for providing trustworthy interpretations of decision-making in domains where complex and opaque machine learning algorithms are deployed. To guarantee their quality and promote user trust, counterfactuals should satisfy the faithfulness desideratum, i.e., they should be supported by the data distribution. We propose a counterfactual generation algorithm for mixed-feature spaces that prioritizes faithfulness through k-justification, a novel counterfactual property introduced in this paper. The algorithm represents the search space as a graph and obtains counterfactuals by solving an integer program. In addition, it is classifier-agnostic and independent of the order in which the feature space is explored. In our empirical evaluation, we demonstrate that it guarantees k-justification while showing comparable performance to state-of-the-art methods in feasibility, sparsity, and proximity.
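To make the notion of a data-supported (justified) counterfactual concrete, the following is a minimal illustrative sketch, not the iJUICE algorithm itself: it searches only among observed training instances for the one closest (in L1 distance) to the input that the classifier assigns the desired label, which trivially guarantees the counterfactual lies on the data distribution. All names here (`find_justified_cf`, `predict`) are hypothetical.

```python
import numpy as np

def find_justified_cf(x, X_train, predict, desired_label):
    """Return the observed training instance nearest to x (L1 distance)
    that the classifier maps to the desired label, or None if none exists."""
    labels = predict(X_train)
    candidates = X_train[labels == desired_label]
    if candidates.size == 0:
        return None  # no observed instance attains the desired label
    dists = np.abs(candidates - x).sum(axis=1)  # L1 also encourages sparse changes
    return candidates[np.argmin(dists)]

# Toy example: a threshold "classifier" on a single feature.
predict = lambda X: (X[:, 0] >= 0.5).astype(int)
X_train = np.array([[0.1], [0.3], [0.6], [0.9]])
x = np.array([0.2])                                   # currently predicted 0
cf = find_justified_cf(x, X_train, predict, desired_label=1)  # nearest label-1 instance
```

Such an exhaustive search scales poorly and ignores feasibility constraints on individual features; the graph-plus-integer-program formulation described in the abstract addresses exactly these limitations while retaining the data-support guarantee.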

Keywords

