Artificial intelligence in clinical and genomic diagnostics
Abstract
Artificial intelligence (AI) is the development of computer systems that can perform tasks that normally require human intelligence. Advances in AI software and hardware, especially deep learning algorithms and the graphics processing units (GPUs) that power their training, have led to rapidly growing interest in medical applications of AI. In clinical diagnostics, AI-based computer vision approaches are poised to revolutionize image-based diagnosis, while other types of AI have begun to show similar promise across a range of diagnostic modalities. In some areas, such as clinical genomics, a specific type of AI algorithm known as deep learning is used to process large and complex genomic datasets. In this review, we first summarize the main classes of problems that AI systems are well suited to solve and describe the clinical diagnostic tasks that benefit from these solutions. Next, we focus on emerging methods for specific tasks in clinical genomics, including variant calling, genome annotation and variant classification, and phenotype-to-genotype correspondence. Finally, we discuss the future potential of AI in individualized medicine applications, especially for risk prediction in common complex diseases, together with the challenges, limitations, and biases that must be carefully addressed for AI to be deployed successfully in medical applications, particularly those that use human genetic and genomic data.