Microbiome Multi-Omics Network Analysis: Statistical Considerations, Limitations, and Opportunities

Duo Jiang1, Courtney R. Armour2, Chenxiao Hu1, Mei Meng1, Chuan Tian1, Thomas J. Sharpton2,1, Yuan Jiang1
1Department of Statistics, Oregon State University, United States
2Department of Microbiology, Oregon State University, United States

