Opening the Black Box: The Promise and Limitations of Explainable Machine Learning in Cardiology

Canadian Journal of Cardiology - Volume 38, Issue 2, Pages 204-213 - 2022
Jeremy Petch1,2,3,4, Shuang Di1,5, William H. Nelson1,6
1Centre for Data Science and Digital Health, Hamilton Health Sciences, Hamilton, Ontario, Canada
2Division of Cardiology, Department of Medicine, McMaster University, Hamilton, Ontario, Canada
3Institute of Health Policy, Management and Evaluation, University of Toronto, Toronto, Ontario, Canada
4Population Health Research Institute, Hamilton, Ontario, Canada
5Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada
6Department of Statistical Sciences, University of Toronto, Toronto, Ontario, Canada

Abstract

Keywords

