A scalable artificial intelligence platform that automatically finds copy number variations (CNVs) in journal articles and transforms them into a database: CNV extraction, transformation, and loading AI (CNV-ETLAI)

Computers in Biology and Medicine - Volume 144 - Page 105332 - 2022
Jongmun Choi1,2,3, Soomin Jeon1, Doyun Kim1, Michelle Chua1, Synho Do1
1Department of Radiology, Laboratory of Medical Imaging and Computation, Massachusetts General Brigham and Harvard Medical School, Boston, MA, USA
2Department of Laboratory Medicine, Hanyang University College of Medicine, Seoul, South Korea
3GC Genome, GC Laboratories, Yong-in, South Korea
