Comparative validation of machine learning algorithms for surgical workflow and skill analysis with the HeiChole benchmark

Medical Image Analysis - Volume 86 - Page 102770 - 2023
Martin Wagner1,2, Beat-Peter Müller-Stich1,2, Anna Kisilenko1,2, Duc Tran1,2, Patrick Heger1, Lars Mündermann3, David M Lubotsky1,2, Benjamin Müller1,2, Tornike Davitashvili1,2, Manuela Capek1,2, Annika Reinke4,5,6, Carissa Reid7, Tong Yu8,9, Armine Vardazaryan8,9, Chinedu Innocent Nwoye8,9, Nicolas Padoy8,9, Xinyang Liu10, Eung-Joo Lee11, Constantin Disch12, Hans Meine12,13
1Department for General, Visceral and Transplantation Surgery, Heidelberg University Hospital, Im Neuenheimer Feld 420, 69120 Heidelberg, Germany
2National Center for Tumor Diseases (NCT) Heidelberg, Im Neuenheimer Feld 460, 69120 Heidelberg, Germany
3Data Assisted Solutions, Corporate Research & Technology, KARL STORZ SE & Co. KG, Dr. Karl-Storz-Str. 34, 78332 Tuttlingen
4Div. Computer Assisted Medical Interventions, German Cancer Research Center (DKFZ), Im Neuenheimer Feld 223, 69120 Heidelberg, Germany
5HIP Helmholtz Imaging Platform, German Cancer Research Center (DKFZ), Im Neuenheimer Feld 223, 69120 Heidelberg, Germany
6Faculty of Mathematics and Computer Science, Heidelberg University, Im Neuenheimer Feld 205, 69120 Heidelberg
7Division of Biostatistics, German Cancer Research Center, Im Neuenheimer Feld 280, Heidelberg, Germany
8ICube, University of Strasbourg, CNRS, France. 300 bd Sébastien Brant - CS 10413, F-67412 Illkirch Cedex, France
9IHU Strasbourg, France. 1 Place de l'hôpital, 67000 Strasbourg, France
10Sheikh Zayed Institute for Pediatric Surgical Innovation, Children’s National Hospital, 111 Michigan Ave. NW, Washington, DC, 20010, USA
11University of Maryland, College Park, 2405 A V Williams Building, College Park, MD 20742, USA
12Fraunhofer Institute for Digital Medicine MEVIS, Max-von-Laue-Str. 2, 28359 Bremen, Germany
13University of Bremen, FB3, Medical Image Computing Group, ℅ Fraunhofer MEVIS, Am Fallturm 1, 28359 Bremen, Germany
