Techniques for interpretable machine learning

Communications of the ACM, Vol. 63, No. 1, pp. 68-77, 2019
Mengnan Du1, Ninghao Liu1, Xia Hu1
1Texas A&M University, College Station, TX

Abstract

Uncovering the mysterious ways machine learning models make decisions.

References

10.1093/bioinformatics/btq134

Ancona, M., Ceolini, E., Oztireli, C. and Gross, M. Towards better understanding of gradient-based attribution methods for deep neural networks. In Proceedings of the Intern. Conf. Learning Representations, 2018.

10.1371/journal.pone.0130140

Bahdanau, D., Cho, K. and Bengio, Y. Neural machine translation by jointly learning to align and translate. In Proceedings of the Intern. Conf. Learning Representations, 2015.

Bastani, O., Kim, C., and Bastani, H. Interpretability via model extraction. In Proceedings of the Fairness, Accountability, and Transparency in Machine Learning Workshop, 2017.

10.1145/2783258.2788613

10.1145/2939672.2939785

Dabkowski, P. and Gal, Y. Real time image saliency for black box classifiers. Advances in Neural Information Processing Systems (2017), 6970--6979.

Dix, A. Human issues in the use of pattern recognition techniques. Neural Networks and Pattern Recognition in Human Computer Interaction (1992), 429--451.

Doshi-Velez, F. and Kim, B. Towards a rigorous science of interpretable machine learning. 2017.

10.1145/3219819.3220099

10.1145/3308558.3313545

10.1109/ICCV.2017.371

10.1145/2594473.2594475

10.5555/3086952

Goodfellow, I.J., Shlens, J. and Szegedy, C. Explaining and harnessing adversarial examples. In Proceedings of the Intern. Conf. Learning Representations, 2015.

10.1162/COLI_a_00300

Karpathy, A., Johnson, J., and Fei-Fei, L. Visualizing and understanding recurrent networks. In Proceedings of the ICLR Workshop, 2016.

10.1145/3289600.3290960

10.1145/3219819.3220027

10.1007/978-1-4899-3242-6

Miller, T. Explanation in artificial intelligence: Insights from the social sciences. Artificial Intelligence (2018).

Molnar, C. Interpretable Machine Learning (2018); https://christophm.github.io/interpretable-ml-book/.

10.18653/v1/P18-1176

Nguyen, A., Dosovitskiy, A., Yosinski, J., Brox, T. and Clune, J. Synthesizing the preferred inputs for neurons in neural networks via deep generator networks. Advances in Neural Information Processing Systems, 2016.

10.1109/CVPR.2015.7298640

Nguyen, A., Yosinski, J. and Clune, J. Multifaceted feature visualization: Uncovering the different types of features learned by each neuron in deep neural networks. In Proceedings of the ICLR Workshop, 2016.

Peters, M.E. et al. Deep contextualized word representations. In Proceedings of the NAACL-HLT, 2018.

10.1016/S0020-7373(87)80053-6

10.18653/v1/N16-3020

Ribeiro, M.T., Singh, S. and Guestrin, C. Anchors: High-precision model-agnostic explanations. In Proceedings of the AAAI Conf. Artificial Intelligence, 2018.

Sabour, S., Frosst, N. and Hinton, G.E. Dynamic routing between capsules. Advances in Neural Information Processing Systems, 2017.

Simonyan, K., Vedaldi, A. and Zisserman, A. Deep inside convolutional networks: Visualising image classification models and saliency maps. In Proceedings of the ICLR Workshop, 2014.

Springenberg, J.T., Dosovitskiy, A., Brox, T. and Riedmiller, M. Striving for simplicity: The all convolutional net. In Proceedings of the ICLR Workshop, 2015.

Tomsett, R., Braines, D., Harborne, D., Preece, A. and Chakraborty, S. Interpretable to whom? A role-based model for analyzing interpretable machine learning systems. In Proceedings of the ICML Workshop on Human Interpretability in Machine Learning, 2018.

Vandewiele, G., Janssens, O., Ongenae, F. and Van Hoecke, S. Genesim: Genetic extraction of a single, interpretable model. In Proceedings of the NIPS Workshop, 2016.

Wachter, S., Mittelstadt, B. and Russell, C. Counterfactual explanations without opening the black box: Automated decisions and the GDPR. 2017.

Xu, K. et al. Show, attend and tell: Neural image caption generation with visual attention. In Proceedings of the Intern. Conf. Machine Learning, 2015.

10.1109/CVPR.2018.00920

Zhou , B. , Khosla , A. , Lapedriza , A. , Oliva , A. and Torralba , A . Object detectors emerge in deep scene CNNs . In Proceedings of the Intern. Conf. Learning Representations , 2015 . Zhou, B., Khosla, A., Lapedriza, A., Oliva, A. and Torralba, A. Object detectors emerge in deep scene CNNs. In Proceedings of the Intern. Conf. Learning Representations, 2015.