Gradient boosted trees with individual explanations: An alternative to logistic regression for viability prediction in the first trimester of pregnancy

Computer Methods and Programs in Biomedicine - Volume 213 - Page 106520 - 2022
Thibaut Vaulet1, Maya Al-Memar2, Hanine Fourie2, Shabnam Bobdiwala2, Srdjan Saso2, Maria Pipi2, Catriona Stalder2, Phillip Bennett2, Dirk Timmerman3,4, Tom Bourne2,3,4, Bart De Moor1
1ESAT-STADIUS, Stadius Centre for Dynamical Systems, Signal Processing and Data Analytics (STADIUS), Leuven (Arenberg) Kasteelpark Arenberg 10 - box 2446, Leuven 3001, Belgium
2Tommy's National Early Miscarriage Research Centre, Queen Charlotte's and Chelsea Hospital, Imperial College, Du Cane Road, London W12 0HS, United Kingdom
3Department of Development and Regeneration, KU Leuven, Leuven, Belgium
4Department of Obstetrics and Gynecology, University Hospitals Leuven, Leuven, Belgium
