Incorporating biological structure into machine learning models in biomedicine

Current Opinion in Biotechnology - Volume 63 - Pages 126-134 - 2020
Jake Crawford (1, 2), Casey S Greene (2, 3)
(1) Graduate Group in Genomics and Computational Biology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
(2) Department of Systems Pharmacology and Translational Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
(3) Childhood Cancer Data Lab, Alex’s Lemonade Stand Foundation, Philadelphia, PA, United States
