Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
Abstract
Keywords
References
Russell, 2016
West, 2018
Goodman, 2017, European union regulations on algorithmic decision-making and a “right to explanation”, AI Magazine, 38, 50, 10.1609/aimag.v38i3.2741
A. Preece, D. Harborne, D. Braines, R. Tomsett, S. Chakraborty, Stakeholders in Explainable AI, 2018.
Gunning, 2017, Explainable artificial intelligence (xAI)
E. Tjoa, C. Guan, A survey on explainable artificial intelligence (XAI): Towards medical XAI, 2019.
Zhu, 2018, Explainable AI for designers: A human-centered perspective on mixed-initiative co-creation, 2018 IEEE Conference on Computational Intelligence and Games (CIG), 1
Došilović, 2018, Explainable artificial intelligence: A survey, 210
P. Hall, On the Art and Science of Machine Learning Explanations, 2018.
Miller, 2019, Explanation in artificial intelligence: Insights from the social sciences, Artif. Intell., 267, 1, 10.1016/j.artint.2018.07.007
L.H. Gilpin, D. Bau, B.Z. Yuan, A. Bajwa, M. Specter, L. Kagal, Explaining Explanations: An Overview of Interpretability of Machine Learning, 2018.
Adadi, 2018, Peeking inside the black-box: A survey on explainable artificial intelligence (XAI), IEEE Access, 6, 52138, 10.1109/ACCESS.2018.2870052
Biran, 2017, Explanation and justification in machine learning: A survey, 8, 1
Mueller, 2019, Explanation in Human-AI Systems: A Literature Meta-Review Synopsis of Key Ideas and Publications and Bibliography for Explainable AI
Guidotti, 2018, A survey of methods for explaining black box models, ACM Computing Surveys, 51, 93:1
Montavon, 2018, Methods for interpreting and understanding deep neural networks, Digital Signal Processing, 73, 1, 10.1016/j.dsp.2017.10.011
Fernandez, 2019, Evolutionary fuzzy systems for explainable artificial intelligence: Why, when, what for, and where to?, IEEE Computational Intelligence Magazine, 14, 69, 10.1109/MCI.2018.2881645
Gleicher, 2016, A framework for considering comprehensibility in modeling, Big Data, 4, 75, 10.1089/big.2016.0007
Craven, 1996, Extracting comprehensible models from trained neural networks
Michalski, 1983, A theory and methodology of inductive learning, 83
Díez, 2013, General theories of explanation: buyer beware, Synthese, 190, 379, 10.1007/s11229-011-0020-8
D. Doran, S. Schulz, T.R. Besold, What does explainable AI really mean? a new conceptualization of perspectives, 2017.
F. Doshi-Velez, B. Kim, Towards a rigorous science of interpretable machine learning, 2017.
Vellido, 2012, Making machine learning models interpretable, 12, 163
Walter, 2008
Besnard, 2008
F. Rossi, AI Ethics for Enterprise AI, 2019.
A. Holzinger, C. Biemann, C.S. Pattichis, D.B. Kell, What do we need to build explainable AI systems for the medical domain?, 2017.
Kim, 2015, iBCM: Interactive Bayesian case model empowering humans via intuitive interaction
Ribeiro, 2016, Why should I trust you?: Explaining the predictions of any classifier, 1135
M. Fox, D. Long, D. Magazzeni, Explainable planning, 2017.
Lane, 2005, Explainable artificial intelligence for training and tutoring
W.J. Murdoch, C. Singh, K. Kumbier, R. Abbasi-Asl, B. Yu, Interpretable machine learning: definitions, methods, and applications, 2019.
Haspiel, 2018, Explanations and expectations: Trust building in automated vehicles, 119
Chander, 2018, Working with beliefs: AI transparency in the enterprise.
Tickle, 1998, The truth will come to light: Directions and challenges in extracting the knowledge embedded within trained artificial neural networks, IEEE Transactions on Neural Networks, 9, 1057, 10.1109/72.728352
Louizos, 2017, Causal effect inference with deep latent-variable models, 6446
Goudet, 2018, Learning functional causal models with generative neural networks, 39
Athey, 2015, Machine learning methods for estimating heterogeneous causal effects, stat, 1050
Lopez-Paz, 2017, Discovering causal signals in images, 6979
C. Barabas, K. Dinakar, J. Ito, M. Virza, J. Zittrain, Interventions over predictions: Reframing the ethical debate for actuarial risk assessment, 2017.
Caruana, 2015, Intelligible models for healthcare: Predicting pneumonia risk and hospital 30-day readmission, 1721
Theodorou, 2017, Designing and implementing transparency for real time inspection of autonomous robots, Connection Science, 29, 230, 10.1080/09540091.2017.1310182
W. Samek, T. Wiegand, K.-R. Müller, Explainable artificial intelligence: Understanding, visualizing and interpreting deep learning models, 2017.
C. Wadsworth, F. Vera, C. Piech, Achieving fairness through adversarial learning: an application to recidivism prediction, 2018.
Yuan, 2019, Adversarial examples: Attacks and defenses for deep learning, IEEE Transactions on Neural Networks and Learning Systems, 30, 2805, 10.1109/TNNLS.2018.2886017
Letham, 2015, Interpretable classifiers using rules and Bayesian analysis: Building a better stroke prediction model, The Annals of Applied Statistics, 9, 1350, 10.1214/15-AOAS848
Harbers, 2010, Design and evaluation of explainable BDI agents, 2, 125
Aung, 2007, Comparing analytical decision support models through boolean rule extraction: A case study of ovarian tumour malignancy, 1177
A. Weller, Challenges for transparency, 2017.
Freitas, 2014, Comprehensible classification models: a position paper, ACM SIGKDD Explorations Newsletter, 15, 1, 10.1145/2594473.2594475
Schetinin, 2007, Confident interpretation of Bayesian decision tree ensembles for clinical applications, IEEE Transactions on Information Technology in Biomedicine, 11, 312, 10.1109/TITB.2006.880553
Martens, 2011, Performance of classification models from a user perspective, Decision Support Systems, 51, 782, 10.1016/j.dss.2011.01.013
Che, 2016, Interpretable deep models for ICU outcome prediction, 2016, 371
Barakat, 2008, Eclectic rule-extraction from support vector machines, International Journal of Computer, Electrical, Automation, Control and Information Engineering, 2, 1672
F.J.C. Garcia, D.A. Robb, X. Liu, A. Laskov, P. Patron, H. Hastie, Explain yourself: A natural language interface for scrutable autonomous robots, 2018.
Langley, 2017, Explainable agency for intelligent autonomous systems, 4762
Montavon, 2017, Explaining nonlinear classification decisions with deep Taylor decomposition, Pattern Recognition, 65, 211, 10.1016/j.patcog.2016.11.008
P.-J. Kindermans, K.T. Schütt, M. Alber, K.-R. Müller, D. Erhan, B. Kim, S. Dähne, Learning how to explain neural networks: PatternNet and PatternAttribution, 2017.
Ras, 2018, Explanation methods in deep learning: Users, values, concerns and challenges, 19
Bach, 2016, Controlling explanatory heatmap resolution and semantics via decomposition depth, 2271
G.J. Katuwal, R. Chen, Machine learning model interpretability for precision medicine, 2016.
Neerincx, 2018, Using perceptual and cognitive explanations for enhanced human-agent team performance, 204
Olden, 2002, Illuminating the “black box”: a randomization approach for understanding variable contributions in artificial neural networks, Ecological Modelling, 154, 135, 10.1016/S0304-3800(02)00064-9
Krause, 2016, Interacting with predictions: Visual inspection of black-box machine learning models, 5686
Rosenbaum, 2011, Interpreting linear support vector machine models with heat map molecule coloring, Journal of Cheminformatics, 3, 11, 10.1186/1758-2946-3-11
Tan, 2014, Unsupervised feature construction and knowledge extraction from genome-wide assays of breast cancer with denoising autoencoders, 132
Krening, 2017, Learning from explanations using sentiment and advice in RL, IEEE Transactions on Cognitive and Developmental Systems, 9, 44, 10.1109/TCDS.2016.2628365
M.T. Ribeiro, S. Singh, C. Guestrin, Model-agnostic interpretability of machine learning, 2016.
Bach, 2015, On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation, PLoS ONE, 10, e0130140, 10.1371/journal.pone.0130140
Etchells, 2006, Orthogonal search-based rule extraction (OSRE) for trained neural networks: a practical and efficient approach, IEEE Transactions on Neural Networks, 17, 374, 10.1109/TNN.2005.863472
Zhang, 2017, Plan explicability and predictability for robot task planning, 1313
Santoro, 2017, A simple neural network module for relational reasoning, 4967
Peng, 2002, The use and interpretation of logistic regression in higher education journals: 1988–1999, Research in Higher Education, 43, 259, 10.1023/A:1014858517172
Üstün, 2007, Visualisation and interpretation of support vector regression models, Analytica Chimica Acta, 595, 299, 10.1016/j.aca.2007.03.023
Zhang, 2019, Interpreting CNNs via decision trees, 6261
Wu, 2018, Beyond sparsity: Tree regularization of deep models for interpretability, 1670
G. Hinton, O. Vinyals, J. Dean, Distilling the knowledge in a neural network, 2015.
N. Frosst, G. Hinton, Distilling a neural network into a soft decision tree, 2017.
Augasta, 2012, Reverse engineering the neural networks for rule extraction in classification problems, Neural Processing Letters, 35, 131, 10.1007/s11063-011-9207-8
Zhou, 2003, Extracting symbolic rules from trained neural network ensembles, AI Communications, 16, 3
H.F. Tan, G. Hooker, M.T. Wells, Tree space prototypes: Another look at making tree ensembles interpretable, 2016.
Fong, 2017, Interpretable explanations of black boxes by meaningful perturbation, 3429
Miller, 2017, Explainable AI: Beware of inmates running the asylum, 36, 36
Goebel, 2018, Explainable AI: the new 42?, 295
Belle, 2017, Logic meets probability: Towards explainable AI systems for uncertain worlds, 5116
Edwards, 2017, Slave to the algorithm: Why a right to an explanation is probably not the remedy you are looking for, Duke L. & Tech. Rev., 16, 18
Lou, 2013, Accurate intelligible models with pairwise interactions, 623
Xu, 2015, Show, attend and tell: Neural image caption generation with visual attention, 2048
Huysmans, 2011, An empirical evaluation of the comprehensibility of decision table, tree and rule based predictive models, Decision Support Systems, 51, 141, 10.1016/j.dss.2010.12.003
Barakat, 2007, Rule extraction from support vector machines: A sequential covering approach, IEEE Transactions on Knowledge and Data Engineering, 19, 729, 10.1109/TKDE.2007.190610
Adriana da Costa, 2005, Fuzzy rule extraction from support vector machines, 335
Martens, 2007, Comprehensible credit scoring models using rule extraction from support vector machines, European Journal of Operational Research, 183, 1466, 10.1016/j.ejor.2006.04.051
Zhou, 2016, Learning deep features for discriminative localization, 2921
Krishnan, 1999, Extracting decision trees from trained neural networks, Pattern Recognition, 32, 1999, 10.1016/S0031-3203(98)00181-2
Fu, 2004, Extracting the knowledge embedded in support vector machines, 1, 291
Green, 2018, “Fair” risk assessments: A precarious approach for criminal justice reform
Chouldechova, 2017, Fair prediction with disparate impact: A study of bias in recidivism prediction instruments, Big Data, 5, 153, 10.1089/big.2016.0047
Kim, 2018, Fairness through computationally-bounded awareness, 4842
Haasdonk, 2005, Feature space interpretation of SVMs with indefinite kernels, IEEE Transactions on Pattern Analysis and Machine Intelligence, 27, 482, 10.1109/TPAMI.2005.78
Palczewska, 2014, Interpreting random forest classification models using a feature contribution method, 193
S.H. Welling, H.H. Refsgaard, P.B. Brockhoff, L.H. Clemmensen, Forest floor visualizations of random forests, 2016.
Fung, 2005, Rule extraction from linear support vector machines, 32
Zhang, 2005, Rule extraction from trained support vector machines, 61
D. Linsley, D. Shiebler, S. Eberhardt, T. Serre, Global-and-local attention networks for visual recognition, 2018.
Zhou, 2008, Low-level interpretability and high-level interpretability: a unified view of data-driven interpretable fuzzy system modelling, Fuzzy Sets and Systems, 159, 3091, 10.1016/j.fss.2008.05.016
Burrell, 2016, How the machine ‘thinks’: Understanding opacity in machine learning algorithms, Big Data & Society, 3, 1, 10.1177/2053951715622512
A. Shrikumar, P. Greenside, A. Shcherbina, A. Kundaje, Not just a black box: Learning important features through propagating activation differences, 2016.
Dong, 2017, Improving interpretability of deep neural networks with semantic information, 4306
Ridgeway, 1998, Interpretable boosted naïve Bayes classification, 101
Zhang, 2018, Interpretable convolutional neural networks, 8827
Seo, 2017, Interpretable convolutional neural networks with dual local and global attention for review rating prediction, 297
Larsen, 2000, Interpreting parameters in the logistic regression model with random effects, Biometrics, 56, 909, 10.1111/j.0006-341X.2000.00909.x
Gaonkar, 2015, Interpreting support vector machine models for multivariate group wise analysis in neuroimaging, Medical Image Analysis, 24, 190, 10.1016/j.media.2015.06.008
K. Xu, D.H. Park, C. Yi, C. Sutton, Interpreting deep classifier by visual distillation of dark knowledge, 2018.
Domingos, 1998, Knowledge discovery via multiple models, Intelligent Data Analysis, 2, 187, 10.1016/S1088-467X(98)00023-7
Tan, 2018, Distill-and-compare: Auditing black-box models using transparent model distillation, 303
Berk, 2013, Statistical procedures for forecasting criminal behavior: A comparative assessment, Criminology & Public Policy, 12, 513, 10.1111/1745-9133.12047
S. Hara, K. Hayashi, Making tree ensembles interpretable, 2016.
A. Henelius, K. Puolamäki, A. Ukkonen, Interpreting classifiers through attribute interactions in datasets, 2017.
Hastie, 2017, MIRIAM: a multimodal chat-based interface for autonomous systems, 495
Bau, 2017, Network dissection: Quantifying interpretability of deep visual representations, 6541
Núñez, 2002, Rule extraction from support vector machines, 107
Núñez, 2006, Rule-based learning systems for support vector machines, Neural Processing Letters, 24, 1, 10.1007/s11063-006-9007-8
M. Kearns, S. Neel, A. Roth, Z.S. Wu, Preventing fairness gerrymandering: Auditing and learning for subgroup fairness, 2017.
E. Akyol, C. Langbort, T. Basar, Price of transparency in strategic machine learning, 2016.
Erhan, 2010, Understanding representations learned in deep architectures, Département d’Informatique et Recherche Opérationnelle, University of Montreal, QC, Canada, Tech. Rep, 1355, 1
Y. Zhang, B. Wallace, A sensitivity analysis of (and practitioners’ guide to) convolutional neural networks for sentence classification, 2015.
Quinlan, 1987, Simplifying decision trees, International Journal of Man-Machine Studies, 27, 221, 10.1016/S0020-7373(87)80053-6
Y. Zhou, G. Hooker, Interpreting models via single tree approximation, 2016.
Navia-Vázquez, 2006, Support vector machine interpretation, Neurocomputing, 69, 1754, 10.1016/j.neucom.2005.12.118
J.J. Thiagarajan, B. Kailkhura, P. Sattigeri, K.N. Ramamurthy, TreeView: Peeking into deep neural networks via feature-space partitioning, 2016.
Zeiler, 2014, Visualizing and understanding convolutional networks, 818
Mahendran, 2015, Understanding deep image representations by inverting them, 5188
Wagner, 2019, Interpretable and fine-grained visual explanations for convolutional neural networks, 9097
Kanehira, 2019, Learning to explain with complemental examples, 8603
D.W. Apley, Visualizing the effects of predictor variables in black box supervised learning models, 2016.
Staniak, 2018, Explanations of Model Predictions with live and breakDown Packages, The R Journal, 10, 395, 10.32614/RJ-2018-072
Zeiler, 2010, Deconvolutional networks, 10, 7
J.T. Springenberg, A. Dosovitskiy, T. Brox, M. Riedmiller, Striving for simplicity: The all convolutional net, 2014.
B. Kim, M. Wattenberg, J. Gilmer, C. Cai, J. Wexler, F. Viegas, R. Sayres, Interpretability beyond feature attribution: Quantitative testing with concept activation vectors (TCAV), 2017.
A. Polino, R. Pascanu, D. Alistarh, Model compression via distillation and quantization, 2018.
W.J. Murdoch, A. Szlam, Automatic rule extraction from long short term memory networks, 2017.
Craven, 1994, Using sampling and queries to extract rules from trained neural networks, 37
Arbatli, 1997, Rule extraction from trained neural networks using genetic algorithms, Nonlinear Analysis: Theory, Methods & Applications, 30, 1639, 10.1016/S0362-546X(96)00267-2
Johansson, 2009, Evolving decision trees using oracle guides, 238
A. Radford, R. Jozefowicz, I. Sutskever, Learning to generate reviews and discovering sentiment, 2017.
R.R. Selvaraju, A. Das, R. Vedantam, M. Cogswell, D. Parikh, D. Batra, Grad-CAM: Why did you say that?, 2016.
R. Shwartz-Ziv, N. Tishby, Opening the black box of deep neural networks via information, 2017.
J. Yosinski, J. Clune, A. Nguyen, T. Fuchs, H. Lipson, Understanding neural networks through deep visualization, 2015.
Pope, 2019, Explainability methods for graph convolutional neural networks, 10772
P. Gajane, M. Pechenizkiy, On formalizing fairness in prediction with machine learning, 2017.
C. Dwork, C. Ilvento, Composition of fair systems, 2018.
Barocas, 2019
Wang, 1999, Smoking and the occurrence of Alzheimer’s disease: Cross-sectional and longitudinal data in a population-based study, American Journal of Epidemiology, 149, 640, 10.1093/oxfordjournals.aje.a009864
Rani, 2006, An empirical study of machine learning techniques for affect recognition in human–robot interaction, Pattern Analysis and Applications, 9, 58, 10.1007/s10044-006-0025-y
Pearl, 2009
Kuhn, 2013, 26
James, 2013, 112
C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, R. Fergus, Intriguing properties of neural networks, 2013.
Ruppert, 1987
Basu, 2018, Iterative random forests to discover predictive and stable high-order interactions, Proceedings of the National Academy of Sciences, 115, 1943, 10.1073/pnas.1711236115
K. Burns, L.A. Hendricks, K. Saenko, T. Darrell, A. Rohrbach, Women also Snowboard: Overcoming Bias in Captioning Models, 2018.
Bennetot, 2019, Towards explainable neural-symbolic visual reasoning
Tibshirani, 1996, Regression shrinkage and selection via the lasso, Journal of the Royal Statistical Society: Series B (Methodological), 58, 267, 10.1111/j.2517-6161.1996.tb02080.x
Lou, 2012, Intelligible models for classification and regression, 150
Kawaguchi, 2016, Deep learning without poor local minima, 586
Datta, 2016, Algorithmic transparency via quantitative input influence: Theory and experiments with learning systems, 598
Bursac, 2008, Purposeful selection of variables in logistic regression, Source Code for Biology and Medicine, 3, 17, 10.1186/1751-0473-3-17
Jaccard, 2001
Hosmer Jr, 2013, 398
Peng, 2002, An introduction to logistic regression analysis and reporting, The Journal of Educational Research, 96, 3, 10.1080/00220670209598786
Hoffrage, 1998, Using natural frequencies to improve diagnostic inferences, Academic Medicine, 73, 538, 10.1097/00001888-199805000-00024
Mood, 2010, Logistic regression: Why we cannot do what we think we can do, and what we can do about it, European Sociological Review, 26, 67, 10.1093/esr/jcp006
Laurent, 1976, Constructing optimal binary decision trees is NP-complete, Information Processing Letters, 5, 15, 10.1016/0020-0190(76)90095-8
Utgoff, 1989, Incremental induction of decision trees, Machine Learning, 4, 161, 10.1023/A:1022699900025
Rokach, 2014, 69
Rovnyak, 1994, Decision trees for real-time transient stability prediction, IEEE Transactions on Power Systems, 9, 1417, 10.1109/59.336122
Nefeslioglu, 2010, Assessment of landslide susceptibility by decision trees in the metropolitan area of Istanbul, Turkey, Mathematical Problems in Engineering, 2010, 10.1155/2010/901095
Imandoust, 2013, Application of k-nearest neighbor (knn) approach for predicting economic events: Theoretical background, International Journal of Engineering Research and Applications, 3, 605
Li, 2004, Application of the GA/KNN method to SELDI proteomics data, Bioinformatics, 20, 1638, 10.1093/bioinformatics/bth098
Guo, 2004, An KNN model-based approach and its application in text categorization, 559
Jiang, 2012, An improved k-nearest-neighbor algorithm for text categorization, Expert Systems with Applications, 39, 1503, 10.1016/j.eswa.2011.08.040
Johansson, 2004, The truth is in there-rule extraction from opaque models using genetic programming, 658
Quinlan, 1987, Generating production rules from decision trees, 87, 304
Langley, 1995, Applications of machine learning and rule induction, Communications of the ACM, 38, 54, 10.1145/219717.219768
Berg, 2007, Bankruptcy prediction by generalized additive models, Applied Stochastic Models in Business and Industry, 23, 129, 10.1002/asmb.658
Calabrese, 2012, Estimating bank loans loss given default by generalized additive models, UCD Geary Institute Discussion Paper Series, WP2012/24
Taylan, 2007, New approaches to regression by generalized additive models and continuous optimization for modern applications in finance, science and technology, Optimization, 56, 675, 10.1080/02331930701618740
Murase, 2009, Application of a generalized additive model (GAM) to reveal relationships between environmental factors and distributions of pelagic fish and krill: a case study in Sendai Bay, Japan, ICES Journal of Marine Science, 66, 1417, 10.1093/icesjms/fsp105
Tomić, 2014, A modified geosite assessment model (M-GAM) and its application on the Lazar Canyon area (Serbia), International Journal of Environmental Research, 8, 1041
Guisan, 2002, Generalized linear and generalized additive models in studies of species distributions: setting the scene, Ecological Modelling, 157, 89, 10.1016/S0304-3800(02)00204-1
Rothery, 2001, Application of generalized additive models to butterfly transect count data, Journal of Applied Statistics, 28, 897, 10.1080/02664760120074979
Pierrot, 2011, Short-term electricity load forecasting with generalized additive models, 410
Griffiths, 2008
Neelon, 2010, A Bayesian model for repeated measures zero-inflated count data with application to outpatient psychiatric service use, Statistical Modelling, 10, 421, 10.1177/1471082X0901000404
McAllister, 1998, Bayesian stock assessment: a review and example application using the logistic model, ICES Journal of Marine Science, 55, 1031, 10.1006/jmsc.1998.0425
Synnaeve, 2011, A Bayesian model for opening prediction in RTS games with application to StarCraft, 281
Min, 2007, Probabilistic climate change predictions applying Bayesian model averaging, Philosophical Transactions of the Royal Society of London A: Mathematical, Physical and Engineering Sciences, 365, 2103
Koop, 2007
Cassandra, 1996, Acting under uncertainty: Discrete Bayesian models for mobile-robot navigation, 2, 963
Chipman, 1998, Bayesian CART model search, Journal of the American Statistical Association, 93, 935, 10.1080/01621459.1998.10473750
Kim, 2014, The Bayesian case model: A generative approach for case-based reasoning and prototype classification, 1952
Kim, 2016, Examples are not enough, learn to criticize! criticism for interpretability, 2280
Johansson, 2004, Accuracy vs. comprehensibility in data mining models, 1, 295
Konig, 2008, G-rex: A versatile framework for evolutionary data mining, 971
H. Lakkaraju, E. Kamar, R. Caruana, J. Leskovec, Interpretable & explorable approximations of black box models, 2017.
Mishra, 2017, Local interpretable model-agnostic explanations for music content analysis, 537
G. Su, D. Wei, K.R. Varshney, D.M. Malioutov, Interpretable two-level boolean rule learning for classification, 2015.
M.T. Ribeiro, S. Singh, C. Guestrin, Nothing else matters: Model-agnostic explanations by identifying prediction invariance, 2016.
Craven, 1996
O. Bastani, C. Kim, H. Bastani, Interpretability via model extraction, 2017.
Hooker, 2004, Discovering additive structure in black box functions, 575
Adler, 2018, Auditing black-box models for indirect influence, Knowledge and Information Systems, 54, 95, 10.1007/s10115-017-1116-3
Koh, 2017, Understanding black-box predictions via influence functions, 1885
Cortez, 2011, Opening black box data mining models using sensitivity analysis, 341
Cortez, 2013, Using sensitivity analysis and visualization techniques to open black box data mining models, Information Sciences, 225, 1, 10.1016/j.ins.2012.10.039
Lundberg, 2017, A unified approach to interpreting model predictions, 4765
Kononenko, 2010, An efficient explanation of individual classifications using game theory, Journal of Machine Learning Research, 11, 1
H. Chen, S. Lundberg, S.-I. Lee, Explaining models by propagating Shapley values of local components, 2019.
Dabkowski, 2017, Real time image saliency for black box classifiers, 6967
Henelius, 2014, A peek into the black box: exploring classifiers by randomization, Data Mining and Knowledge Discovery, 28, 1503, 10.1007/s10618-014-0368-8
J. Moeyersoms, B. d’Alessandro, F. Provost, D. Martens, Explaining classification models built on high-dimensional sparse data, 2016.
Baehrens, 2010, How to explain individual classification decisions, Journal of Machine Learning Research, 11, 1803
J. Adebayo, L. Kagal, Iterative orthogonal feature projection for diagnosing bias in black-box models, 2016.
R. Guidotti, A. Monreale, S. Ruggieri, D. Pedreschi, F. Turini, F. Giannotti, Local rule-based explanations of black box decision systems, 2018.
Krishnan, 2017, PALM: Machine learning explanations for iterative debugging, 4
Robnik-Šikonja, 2008, Explaining classifications for individual instances, IEEE Transactions on Knowledge and Data Engineering, 20, 589, 10.1109/TKDE.2007.190734
Ribeiro, 2018, Anchors: High-precision model-agnostic explanations, 1527
Martens, 2014, Explaining data-driven document classifications, MIS Quarterly, 38, 73, 10.25300/MISQ/2014/38.1.04
Chen, 2017, Enhancing transparency and control when drawing data-driven inferences about individuals, Big Data, 5, 197, 10.1089/big.2017.0074
Goldstein, 2015, Peeking inside the black box: Visualizing statistical learning with plots of individual conditional expectation, Journal of Computational and Graphical Statistics, 24, 44, 10.1080/10618600.2014.907095
Casalicchio, 2018, Visualizing the feature importance for black box models, 655
Tolomei, 2017, Interpretable predictions of tree-based ensembles via actionable feature tweaking, 465
Auret, 2012, Interpretation of nonlinear relationships between process variables by use of random forests, Minerals Engineering, 35, 27, 10.1016/j.mineng.2012.05.008
Rajani, 2018, Stacking with auxiliary features for visual question answering, 2217
Rajani, 2018, Ensembling visual explanations, 155
Chen, 2007, A multiple kernel support vector machine scheme for feature selection and rule extraction from gene expression data of cancer tissue, Artificial Intelligence in Medicine, 41, 161, 10.1016/j.artmed.2007.07.008
Núñez, 2002, Support vector machines with symbolic interpretation, 142
Sollich, 2002, Bayesian methods for support vector machines: Evidence and predictive class probabilities, Machine Learning, 46, 21, 10.1023/A:1012489924661
Sollich, 2000, Probabilistic methods for support vector machines, 349
Landecker, 2013, Interpreting individual classifications of hierarchical networks, 32
Jakulin, 2005, Nomograms for visualizing support vector machines, 108
Fu, 1994, Rule generation from neural networks, IEEE Transactions on Systems, Man, and Cybernetics, 24, 1114, 10.1109/21.299696
Towell, 1993, Extracting refined rules from knowledge-based neural networks, Machine Learning, 13, 71, 10.1007/BF00993103
Thrun, 1994, Extracting rules from artificial neural networks with distributed representations, 505
Setiono, 2000, FERNN: An algorithm for fast extraction of rules from neural networks, Applied Intelligence, 12, 15, 10.1023/A:1008307919726
Taha, 1999, Symbolic interpretation of artificial neural networks, IEEE Transactions on Knowledge and Data Engineering, 11, 448, 10.1109/69.774103
Tsukimoto, 2000, Extracting rules from trained neural networks, IEEE Transactions on Neural Networks, 11, 377, 10.1109/72.839008
Zilke, 2016, DeepRED – rule extraction from deep neural networks, 457
Schmitz, 1999, ANN-DT: an algorithm for extraction of decision trees from artificial neural networks, IEEE Transactions on Neural Networks, 10, 1392, 10.1109/72.809084
Sato, 2001, Rule extraction from neural networks via decision tree induction, 3, 1870
Féraud, 2002, A methodology to explain neural network classification, Neural Networks, 15, 237, 10.1016/S0893-6080(01)00127-7
A. Shrikumar, P. Greenside, A. Kundaje, Learning Important Features Through Propagating Activation Differences, 2017.
Sundararajan, 2017, Axiomatic attribution for deep networks, 70, 3319
J. Adebayo, J. Gilmer, I. Goodfellow, B. Kim, Local explanation methods for deep neural networks lack sensitivity to parameter values, 2018.
N. Papernot, P. McDaniel, Deep k-nearest neighbors: Towards confident, interpretable and robust deep learning, 2018.
Tan, 2015, Improving the interpretability of deep neural networks with stimulated learning, 617
L. Rieger, C. Singh, W.J. Murdoch, B. Yu, Interpretations are useful: penalizing explanations to align neural networks with prior knowledge, 2019.
Nguyen, 2016, Synthesizing the preferred inputs for neurons in neural networks via deep generator networks, 3387
Li, 2016, Convergent learning: Do different neural networks learn the same representations?
Liu, 2016, Towards better analysis of deep convolutional neural networks, IEEE Transactions on Visualization and Computer Graphics, 23, 91, 10.1109/TVCG.2016.2598831
Y. Goyal, A. Mohapatra, D. Parikh, D. Batra, Towards transparent AI systems: Interpreting visual question answering models, 2016.
K. Simonyan, A. Vedaldi, A. Zisserman, Deep inside convolutional networks: Visualising image classification models and saliency maps, 2013.
Nguyen, 2015, Deep neural networks are easily fooled: High confidence predictions for unrecognizable images, 427
Donahue, 2015, Long-term recurrent convolutional networks for visual recognition and description, 2625
M. Lin, Q. Chen, S. Yan, Network in network, 2013.
L.A. Hendricks, Z. Akata, M. Rohrbach, J. Donahue, B. Schiele, T. Darrell, Generating Visual Explanations, 2016.
Wang, 2017, Residual attention network for image classification, 3156
Xiao, 2015, The application of two-level attention models in deep convolutional neural network for fine-grained image classification, 842
Q. Zhang, R. Cao, Y. Nian Wu, S.-C. Zhu, Growing Interpretable Part Graphs on ConvNets via Multi-Shot Learning, 2016.
L. Arras, G. Montavon, K.-R. Müller, W. Samek, Explaining recurrent neural network predictions in sentiment analysis, 2017.
A. Karpathy, J. Johnson, L. Fei-Fei, Visualizing and understanding recurrent networks, 2015.
Clos, 2017, Towards explainable text classification by jointly learning lexicon and modifier terms, 19
S. Wisdom, T. Powers, J. Pitton, L. Atlas, Interpretable recurrent neural networks using sequential sparse recovery, 2016.
V. Krakovna, F. Doshi-Velez, Increasing the interpretability of recurrent neural networks using hidden Markov models, 2016.
Choi, 2016, RETAIN: An interpretable predictive model for healthcare using reverse time attention mechanism, 3504
Breiman, 2017
A. Lucic, H. Haned, M. de Rijke, Explaining predictions from tree-based boosting ensembles, 2019.
S.M. Lundberg, G.G. Erion, S.-I. Lee, Consistent individualized feature attribution for tree ensembles, 2018.
Buciluǎ, 2006, Model compression, 535
R. Traoré, H. Caselles-Dupré, T. Lesort, T. Sun, G. Cai, N.D. Rodríguez, D. Filliat, DisCoRL: Continual reinforcement learning via policy distillation, 2019.
Zeiler, 2011, Adaptive deconvolutional networks for mid and high level feature learning, 1, 6
Selvaraju, 2017, Grad-CAM: Visual explanations from deep networks via gradient-based localization, 618
Adebayo, 2018, Sanity checks for saliency maps, 9505
Z. Che, S. Purushotham, R. Khemani, Y. Liu, Distilling knowledge from deep networks with applications to healthcare domain, 2015.
Donadello, 2017, Logic tensor networks for semantic image interpretation, Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, IJCAI, 1596
Donadello, 2018
A.S. d’Avila Garcez, M. Gori, L.C. Lamb, L. Serafini, M. Spranger, S.N. Tran, Neural-symbolic computing: An effective methodology for principled integration of machine learning and reasoning, 2019.
Manhaeve, 2018, DeepProbLog: Neural probabilistic logic programming, 3749
Donadello, 2019, Persuasive explanation of reasoning inferences on dietary data
R.G. Krishnan, U. Shalit, D. Sontag, Deep Kalman Filters, 2015.
M. Karl, M. Soelch, J. Bayer, P. van der Smagt, Deep Variational Bayes Filters: Unsupervised Learning of State Space Models from Raw Data, 2016.
Johnson, 2016, Composing graphical models with neural networks for structured representations and fast inference, 2946
Zheng, 2015, Conditional random fields as recurrent neural networks, 1529
Narodytska, 2018, Learning optimal decision trees with SAT, 1362
Loyola-González, 2019, Black-box vs. white-box: Understanding their advantages and weaknesses from a practical point of view, IEEE Access, 7, 154096, 10.1109/ACCESS.2019.2949286
F. Petroni, T. Rocktäschel, P. Lewis, A. Bakhtin, Y. Wu, A.H. Miller, S. Riedel, Language models as knowledge bases?, 2019.
Bollacker, 2019, Extending knowledge graphs with subjective influence networks for personalized fashion, 203
W. Shang, A. Trott, S. Zheng, C. Xiong, R. Socher, Learning world graphs to accelerate hierarchical reinforcement learning, 2019.
Zolotas, 2019
M. Garnelo, K. Arulkumaran, M. Shanahan, Towards deep symbolic reinforcement learning, 2016.
Bellini, 2018, Knowledge-aware autoencoders for explainable recommender systems, 24
C.-Z. A. Huang, A. Vaswani, J. Uszkoreit, N. Shazeer, C. Hawthorne, A.M. Dai, M.D. Hoffman, D. Eck, Music transformer: Generating music with long-term structure, 2018.
M. Cornia, L. Baraldi, R. Cucchiara, Smart: Training shallow memory-aware transformers for robotic explainability, 2019.
Aamodt, 1994, Case-based reasoning: Foundational issues, methodological variations, and system approaches, 7, 39
Caruana, 2000, Case-based explanation for artificial neural nets, 303
M.T. Keane, E.M. Kenny, The Twin-System Approach as One Generic Solution for XAI: An Overview of ANN-CBR Twins for Explaining Deep Learning, 2019.
T. Hailesilassie, Rule extraction algorithm for deep neural networks: A review, 2016.
Benitez, 1997, Are artificial neural networks black boxes?, IEEE Trans. Neural Networks, 8, 1156, 10.1109/72.623216
Johansson, 2005, Automatically balancing accuracy and comprehensibility in predictive modeling, 2, 7pp.
D. Smilkov, N. Thorat, B. Kim, F. Viégas, M. Wattenberg, SmoothGrad: removing noise by adding noise, 2017.
M. Ancona, E. Ceolini, C. Öztireli, M. Gross, Towards better understanding of gradient-based attribution methods for Deep Neural Networks, 2017.
J. Yosinski, J. Clune, Y. Bengio, H. Lipson, How transferable are features in deep neural networks?, 2014.
A. Sharif Razavian, H. Azizpour, J. Sullivan, S. Carlsson, CNN Features off-the-shelf: an Astounding Baseline for Recognition, 2014.
Du, 2017, Self-driving car steering angle prediction based on image recognition
B. Zhou, A. Khosla, A. Lapedriza, A. Oliva, A. Torralba, Object Detectors Emerge in Deep Scene CNNs, 2014.
Y. Zhang, X. Chen, Explainable Recommendation: A Survey and New Perspectives, 2018.
J. Frankle, M. Carbin, The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks, 2018.
A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A.N. Gomez, L. Kaiser, I. Polosukhin, Attention Is All You Need, 2017.
Lu, 2016, Hierarchical question-image co-attention for visual question answering, 289
A. Das, H. Agrawal, C.L. Zitnick, D. Parikh, D. Batra, Human Attention in Visual Question Answering: Do Humans and Deep Networks Look at the Same Regions?, 2016.
D. Huk Park, L.A. Hendricks, Z. Akata, A. Rohrbach, B. Schiele, T. Darrell, M. Rohrbach, Multimodal Explanations: Justifying Decisions and Pointing to the Evidence, 2018.
A. Slavin Ross, M.C. Hughes, F. Doshi-Velez, Right for the Right Reasons: Training Differentiable Models by Constraining their Explanations, 2017.
Hyvärinen, 2000, Independent component analysis: Algorithms and applications, Neural Networks, 13, 411, 10.1016/S0893-6080(00)00026-5
Berry, 2007, Algorithms and applications for approximate nonnegative matrix factorization, Computational Statistics & Data Analysis, 52, 155, 10.1016/j.csda.2006.11.006
D.P. Kingma, M. Welling, Auto-Encoding Variational Bayes, 2013.
Higgins, 2017, beta-VAE: Learning basic visual concepts with a constrained variational framework
X. Chen, Y. Duan, R. Houthooft, J. Schulman, I. Sutskever, P. Abbeel, InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets, 2016.
Q. Zhang, Y. Yang, Y. Liu, Y. Nian Wu, S.-C. Zhu, Unsupervised Learning of Neural Networks to Explain Neural Networks, 2018.
S. Sabour, N. Frosst, G. E Hinton, Dynamic Routing Between Capsules, 2017.
A. Agrawal, J. Lu, S. Antol, M. Mitchell, C.L. Zitnick, D. Batra, D. Parikh, VQA: Visual Question Answering, 2015.
A. Fukui, D. Huk Park, D. Yang, A. Rohrbach, T. Darrell, M. Rohrbach, Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding, 2016.
D. Bouchacourt, L. Denoyer, EDUCE: explaining model decisions through unsupervised concepts extraction, 2019.
Hofer, 2006, Design and Implementation of a Backward-In-Time Debugger, P-88, 17
Diez-Olivan, 2019, Data fusion and machine learning for industrial prognosis: Trends and perspectives towards Industry 4.0, Information Fusion, 50, 92, 10.1016/j.inffus.2018.10.005
R.R. Hoffman, S.T. Mueller, G. Klein, J. Litman, Metrics for explainable AI: Challenges and prospects, 2018.
S. Mohseni, N. Zarei, E.D. Ragan, A multidisciplinary survey and framework for design and evaluation of explainable AI systems, 2018.
Byrne, 2019, Counterfactuals in explainable artificial intelligence (XAI): Evidence from human reasoning, 6276
Garnelo, 2019, Reconciling deep learning with symbolic artificial intelligence: representing objects and relations, Current Opinion in Behavioral Sciences, 29, 17, 10.1016/j.cobeha.2018.12.010
G. Marra, F. Giannini, M. Diligenti, M. Gori, Integrating learning and reasoning with deep logic models, 2019.
Kelley, 2003, Good practice in the conduct and reporting of survey research, International Journal for Quality in Health Care, 15, 261, 10.1093/intqhc/mzg031
Wachter, 2017, Why a right to explanation of automated decision-making does not exist in the general data protection regulation, International Data Privacy Law, 7, 76, 10.1093/idpl/ipx005
Oh, 2019, Towards reverse-engineering black-box neural networks, 121
I.J. Goodfellow, J. Shlens, C. Szegedy, Explaining and harnessing adversarial examples, 2014.
K. Eykholt, I. Evtimov, E. Fernandes, B. Li, A. Rahmati, C. Xiao, A. Prakash, T. Kohno, D. Song, Robust physical-world attacks on deep learning models, 2017.
I.J. Goodfellow, N. Papernot, P.D. McDaniel, cleverhans v0.1: an adversarial machine learning library, 2016.
Xiao, 2015, Support vector machines under adversarial label contamination, Neurocomputing, 160, 53, 10.1016/j.neucom.2014.08.081
Biggio, 2013, Evasion attacks against machine learning at test time, 387
B. Biggio, I. Pillai, S.R. Bulò, D. Ariu, M. Pelillo, F. Roli, Is data clustering in adversarial settings secure?, 2018.
Pan, 2019, Recent progress on generative adversarial networks (GANs): A survey, IEEE Access, 7, 36322, 10.1109/ACCESS.2019.2905015
Charte, 2018, A practical tutorial on autoencoders for nonlinear feature fusion: Taxonomy, models, software and guidelines, Information Fusion, 44, 78, 10.1016/j.inffus.2017.12.007
Baumgartner, 2018, Visual feature attribution using Wasserstein GANs, 8309
Biffi, 2018, Learning interpretable anatomical features through deep generative models: Application to cardiac remodeling, 464
S. Liu, B. Kailkhura, D. Loveland, Y. Han, Generative counterfactual introspection for explainable deep learning, 2019.
Varshney, 2017, On the safety of machine learning: Cyber-physical systems, decision sciences, and data products, Big data, 5, 246, 10.1089/big.2016.0051
Weiss, 2004, Mining with rarity: a unifying framework, ACM Sigkdd Explorations Newsletter, 6, 7, 10.1145/1007730.1007734
Attenberg, 2015, Beat the machine: Challenging humans to find a predictive model’s “unknown unknowns”, Journal of Data and Information Quality (JDIQ), 6, 1, 10.1145/2700832
Neff, 2017, Critique and contribute: A practice-based framework for improving critical data studies and data science, Big data, 5, 85, 10.1089/big.2016.0050
Iliadis, 2016, Critical data studies: An introduction, Big Data & Society, 3, 10.1177/2053951716674238
Karpatne, 2017, Theory-guided data science: A new paradigm for scientific discovery from data, IEEE Transactions on Knowledge and Data Engineering, 29, 2318, 10.1109/TKDE.2017.2720168
Hautier, 2010, Finding nature’s missing ternary oxide compounds using machine learning and density functional theory, Chemistry of Materials, 22, 3762, 10.1021/cm100795d
Fischer, 2006, Predicting crystal structure by merging data mining with quantum mechanics, Nature materials, 5, 641, 10.1038/nmat1691
Curtarolo, 2013, The high-throughput highway to computational materials design, Nature materials, 12, 191, 10.1038/nmat3568
Wong, 2009, Active model with orthotropic hyperelastic material for cardiac image analysis, 229
Xu, 2015, Robust transmural electrophysiological imaging: Integrating sparse and dynamic physiological models into ECG-based inference, 519
T. Lesort, M. Seurin, X. Li, N. Díaz-Rodríguez, D. Filliat, Unsupervised state representation learning with robotic priors: a robustness benchmark, 2017.
Leibo, 2017, View-tolerant face recognition and Hebbian learning imply mirror-symmetric neural tuning to head orientation, Current Biology, 27, 62, 10.1016/j.cub.2016.10.015
Schrodt, 2015, BHPMF – a hierarchical Bayesian approach to gap-filling and trait prediction for macroecology and functional biogeography, Global Ecology and Biogeography, 24, 1510, 10.1111/geb.12335
Leslie, 2019
Rudin, 2018
J. Fjeld, H. Hilligoss, N. Achten, M.L. Daniel, J. Feldman, S. Kagay, Principled artificial intelligence: A map of ethical and rights-based approaches, 2019.
R. Benjamins, A. Barbado, D. Sierra, Responsible AI by design, 2019.
United-Nations, 2015, Transforming our World: the 2030 Agenda for Sustainable Development
G.D. Hager, A. Drobnis, F. Fang, R. Ghani, A. Greenwald, T. Lyons, D.C. Parkes, J. Schultz, S. Saria, S.F. Smith, M. Tambe, Artificial intelligence for social good, 2019.
Stahl, 2018, Ethics and privacy in AI and big data: Implementing responsible research and innovation, IEEE Security & Privacy, 16, 26, 10.1109/MSP.2018.2701164
High Level Expert Group on Artificial Intelligence, 2019, Ethics Guidelines for Trustworthy AI
d’Alessandro, 2017, Conscientious classification: A data scientist’s guide to discrimination-aware classification, Big data, 5, 120, 10.1089/big.2016.0048
Barocas, 2016, Big data’s disparate impact, Calif. L. Rev., 104, 671
Hardt, 2016, Equality of opportunity in supervised learning, 3315
Speicher, 2018, A unified approach to quantifying algorithmic unfairness: Measuring individual & group unfairness via inequality indices, 2239
Kamiran, 2012, Data preprocessing techniques for classification without discrimination, Knowledge and Information Systems, 33, 1, 10.1007/s10115-011-0463-8
Zemel, 2013, Learning fair representations, 325
Zhang, 2018, Mitigating unwanted biases with adversarial learning, 335
Ahn, 2019, FairSight: Visual analytics for fairness in decision making, IEEE Transactions on Visualization and Computer Graphics, 10.1109/TVCG.2019.2934262
Soares, 2019, Fair-by-design explainable models for prediction of recidivism, arXiv preprint arXiv:1910.02043
Dressel, 2018, The accuracy, fairness, and limits of predicting recidivism, Science advances, 4, eaao5580, 10.1126/sciadv.aao5580
Aivodji, 2019, Fairwashing: the risk of rationalization, 161
Sharma, 2019, CERTIFAI: Counterfactual explanations for robustness, transparency, interpretability, and fairness of artificial intelligence models, arXiv preprint arXiv:1905.07857
Lerman, 2013, Big data and its exclusions, Stan. L. Rev. Online, 66, 55
Agrawal, 2009, Diversifying search results, 5
Smyth, 2001, Similarity vs. diversity, 347
Wang, 2019, Data fusion in cyber-physical-social systems: State-of-the-art and perspectives, Information Fusion, 51, 42, 10.1016/j.inffus.2018.11.002
Ding, 2019, A survey on data fusion in internet of things: Towards secure and privacy-preserving fusion, Information Fusion, 51, 129, 10.1016/j.inffus.2018.12.001
Smirnov, 2019, Knowledge fusion patterns: A survey, Information Fusion, 52, 31, 10.1016/j.inffus.2018.11.007
Lau, 2019, A survey of data fusion in smart city applications, Information Fusion, 52, 357, 10.1016/j.inffus.2019.05.004
Ramírez-Gallego, 2018, Big data: Tutorial and guidelines on information and process fusion for analytics algorithms with MapReduce, Information Fusion, 42, 51, 10.1016/j.inffus.2017.10.001
J. Konečný, H.B. McMahan, D. Ramage, P. Richtárik, Federated optimization: Distributed machine learning for on-device intelligence, 2016.
McMahan, 2017, Communication-efficient learning of deep networks from decentralized data, 1273
J. Konečný, H.B. McMahan, F.X. Yu, P. Richtárik, A.T. Suresh, D. Bacon, Federated learning: Strategies for improving communication efficiency, 2016.
Sun, 2013, A survey of multi-view machine learning, Neural computing and applications, 23, 2031, 10.1007/s00521-013-1362-6
Zhang, 2019, Feature selection with multi-view data: A survey, Information Fusion, 50, 158, 10.1016/j.inffus.2018.11.019
Zhao, 2017, Multi-view learning overview: Recent progress and new challenges, Information Fusion, 38, 43, 10.1016/j.inffus.2017.02.007
Oh, 2016, Faceless person recognition: Privacy implications in social media, 19
Aditya, 2016, I-pic: A platform for privacy-compliant image capture, 235
Sun, 2018, A hybrid model for identity obfuscation by face replacement, 553
Dong, 2013, Big data integration, 1245
Zhang, 2015, coMobile: Real-time human mobility modeling at urban scale using multi-view learning, 40
Pan, 2009, A survey on transfer learning, IEEE Transactions on knowledge and data engineering, 22, 1345, 10.1109/TKDE.2009.191
Mitchell, 2019, Model cards for model reporting, 220