Clinical and Laboratory Approach to Diagnose COVID-19 Using Machine Learning

Krishnaraj Chadaga1, Chinmay Chakraborty2, Srikanth Prabhu1, Shashikiran Umakanth3, Vivekananda Bhat1, Niranjana Sampathila4
1Department of Computer Science and Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, India
2Department of Electronics and Communication, Birla Institute of Technology, Mesra, India
3Department of Medicine, Dr. TMA Hospital, Manipal Academy of Higher Education, Manipal, India
4Department of Biomedical Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, India

Abstract

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the virus that causes COVID-19, has had a significant impact on both the economy and health infrastructure worldwide. Infection is conventionally diagnosed with the reverse transcription polymerase chain reaction (RT-PCR) test. This approach, however, produces many false-negative and otherwise erroneous results. According to recent studies, COVID-19 can also be diagnosed using X-rays, CT scans, blood tests and cough sounds. In this article, we use blood tests and machine learning to predict COVID-19 diagnosis. We also present an extensive review of existing machine-learning applications that diagnose COVID-19 from clinical and laboratory markers. Four different classifiers were used for classification, together with the Synthetic Minority Oversampling Technique (SMOTE). The Shapley Additive Explanations (SHAP) method was used to calculate the importance of each feature; eosinophils, monocytes, leukocytes and platelets were found to be the most critical blood parameters distinguishing COVID-19 infection in our dataset. These classifiers can be used in conjunction with RT-PCR tests to improve sensitivity, and in emergency situations such as a pandemic outbreak caused by new strains of the virus. The positive results indicate the potential of an automated framework that could help clinicians and medical personnel diagnose and screen patients.
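The abstract names SMOTE but does not say how it was configured or which library was used. As a minimal sketch of the oversampling idea only, the following pure-Python example creates synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority neighbours. The function name `smote`, its parameters, and the toy two-feature data are illustrative assumptions, not the authors' implementation or dataset.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic minority samples by SMOTE-style interpolation.

    minority: list of equal-length feature vectors (lists of floats).
    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: two scaled blood parameters per patient.
negatives = [[0.1, 0.9], [0.2, 0.8], [0.15, 0.85],
             [0.3, 0.7], [0.25, 0.95], [0.05, 0.75]]
positives = [[0.8, 0.2], [0.9, 0.1]]

# Oversample the positive (minority) class until both classes are the same size.
new_positives = smote(positives, n_new=len(negatives) - len(positives), k=1)
balanced_positives = positives + new_positives
```

In a real pipeline the synthetic samples would be generated from the training split only, so that no interpolated points leak into the data used for evaluation.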
