Correlation and association analyses in microbiome study integrating multiomics in health and disease

Yinglin Xia
Department of Medicine, University of Illinois at Chicago, Chicago, IL, United States

References

Zhang, 2019, Perspective and guidelines for metaproteomics in microbiome studies, J Proteome Res, 18, 2370, 10.1021/acs.jproteome.9b00054

Rodgers, 1988, Thirteen ways to look at the correlation coefficient, Am Stat, 42, 59, 10.2307/2685263

Bonett, 2005, Inferential methods for the tetrachoric correlation coefficient, J Educ Behav Stat, 30, 213, 10.3102/10769986030002213

Brossette, 1998, Association rules and data mining in hospital infection control and public health surveillance, J Am Med Inform Assoc, 5, 373, 10.1136/jamia.1998.0050373

Greenblum, 2012, Metagenomic systems biology of the human gut microbiome reveals topological shifts associated with obesity and inflammatory bowel disease, Proc Natl Acad Sci USA, 109, 594, 10.1073/pnas.1116053109

Khamis, 2008, Measures of association: how to choose?, J Diagn Med Sonogr, 24, 155, 10.1177/8756479308317006

Wright, 2010, An automated technique for identifying associations between medications, laboratory results and problems, J Biomed Inform, 43, 891, 10.1016/j.jbi.2010.09.009

Xia, 2018, 29

Hahsler

Liebetrau, 1983

Baldi, 2000, Assessing the accuracy of prediction algorithms for classification: an overview, Bioinformatics, 16, 412, 10.1093/bioinformatics/16.5.412

Shadish, 2002

Al-Katib, 2007, Epididymal and testicular lesions in rams following experimental infection with Actinobacillus seminis, N Z Vet J, 55, 125, 10.1080/00480169.2007.36754

Cook, 1979

Locke, 1975

Moe, 2010, Detection of antibodies against Fusobacterium necrophorum and Porphyromonas levii-like species in dairy cattle with papillomatous digital dermatitis, Microbiol Immunol, 54, 338, 10.1111/j.1348-0421.2010.00220.x

Berry, 2018, Chapter 1. Introduction

Reynolds, 1977

Fleiss, 2003

Paliy, 2016, Application of multivariate statistical techniques in microbial ecology, Mol Ecol, 25, 1032, 10.1111/mec.13536

Clarke, 2008, The properties of high-dimensional data spaces: implications for exploring gene and protein expression data, Nat Rev Cancer, 8, 37, 10.1038/nrc2294

Aitchison, 1986

Lovell, 2011, Proportions, percentages, PPM: do the molecular biosciences treat compositional data right?

Xia, 2018, Compositional analysis of microbiome data, 331

Friedman, 2012, Inferring correlation networks from genomic survey data, PLoS Comput Biol, 8, 10.1371/journal.pcbi.1002687

Eaton, 1983, 512

Steuer, 2002, The mutual information: detecting and evaluating dependencies between variables, Bioinformatics, 18, S231, 10.1093/bioinformatics/18.suppl_2.S231

Faust, 2012, Microbial co-occurrence relationships in the human microbiome, PLoS Comput Biol, 8, e1002606, 10.1371/journal.pcbi.1002606

Weiss, 2016, Correlation detection strategies in microbial data sets vary widely in sensitivity and precision, ISME J, 10, 1669, 10.1038/ismej.2015.235

Sohn, 2018, A GLM-based latent variable ordination method for microbiome samples, Biometrics, 74, 448, 10.1111/biom.12775

Paulson, 2013, Differential abundance analysis for microbial marker-gene surveys, Nat Methods, 10, 1200, 10.1038/nmeth.2658

Wang, 2016, Genome-wide association analysis identifies variation in vitamin D receptor and other host factors influencing the gut microbiota, Nat Genet, 48, 1396, 10.1038/ng.3695

Jiang, 2019, Microbiome multi-omics network analysis: statistical considerations, limitations, and opportunities, Front Genet, 10, 995, 10.3389/fgene.2019.00995

Chen, 2018, A system biology perspective on environment–host–microbe interactions, Hum Mol Genet, 27, R187, 10.1093/hmg/ddy137

Dai, 2018, Multi-cohort analysis of colorectal cancer metagenome identified altered bacteria across populations and universal bacterial markers, Microbiome, 6, 70, 10.1186/s40168-018-0451-2

Dai, 2018, Batch effects correction for microbiome data with Dirichlet-multinomial regression, Bioinformatics, 35, 807, 10.1093/bioinformatics/bty729

Gibbons, 2018, Correcting for batch effects in case-control microbiome studies, PLoS Comput Biol, 14, 10.1371/journal.pcbi.1006102

Costea, 2017, Towards standards for human fecal sample processing in metagenomic studies, Nat Biotechnol, 35, 1069, 10.1038/nbt.3960

Kennedy, 2014, The impact of different DNA extraction kits and laboratories upon the assessment of human gut microbiota composition by 16S rRNA gene sequencing, PLoS One, 9, 10.1371/journal.pone.0088982

Maukonen, 2012, The currently used commercial DNA-extraction methods give different results of clostridial and actinobacterial populations derived from human fecal samples, FEMS Microbiol Ecol, 79, 697, 10.1111/j.1574-6941.2011.01257.x

McOrist, 2002, A comparison of five methods for extraction of bacterial DNA from human faecal samples, J Microbiol Methods, 50, 131, 10.1016/S0167-7012(02)00018-0

Smith, 2011, Optimising bacterial DNA extraction from faecal samples: comparison of three methods, Open Microbiol J, 5, 14, 10.2174/1874285801105010014

Sinha, 2017, Assessment of variation in microbial community amplicon sequencing by the Microbiome Quality Control (MBQC) project consortium, Nat Biotechnol, 35, 1077, 10.1038/nbt.3981

Song, 2016, Preservation methods differ in fecal microbiome stability, affecting suitability for field studies, mSystems, 1, 10.1128/mSystems.00021-16

Vandeputte, 2017, Practical considerations for large-scale gut microbiome studies, FEMS Microbiol Rev, 41, S154, 10.1093/femsre/fux027

Lloyd-Price, 2019, Multi-omics of the gut microbial ecosystem in inflammatory bowel diseases, Nature, 569, 655, 10.1038/s41586-019-1237-9

Dhariwal, 2017, MicrobiomeAnalyst: a web-based tool for comprehensive statistical, visual and meta-analysis of microbiome data, Nucleic Acids Res, 45, W180, 10.1093/nar/gkx295

Duvallet, 2018, Meta-analysis generates and prioritizes hypotheses for translational microbiome research, Microb Biotechnol, 11, 273, 10.1111/1751-7915.13047

Duvallet, 2017, Meta-analysis of gut microbiome studies identifies disease-specific and shared responses, Nat Commun, 8, 1784, 10.1038/s41467-017-01973-8

Pasolli, 2016, Machine learning meta-analysis of large metagenomic datasets: tools and biological insights, PLoS Comput Biol, 12, 10.1371/journal.pcbi.1004977

Pearson, 1920, Notes on the history of correlation, Biometrika, 13, 25, 10.1093/biomet/13.1.25

Pearson, 1896, Mathematical contributions to the theory of evolution. III. Regression, heredity, and panmixia, Philos Trans R Soc Lond Ser A, 187, 253, 10.1098/rsta.1896.0007

Theriot, 2014, Antibiotic-induced shifts in the mouse gut microbiome and metabolome increase susceptibility to Clostridium difficile infection, Nat Commun, 5, 3114, 10.1038/ncomms4114

Weir, 2013, Stool microbiome and metabolome differences between colorectal cancer patients and healthy adults, PLoS One, 8, e70803, 10.1371/journal.pone.0070803

Kendall, 1955

Yule, 1950

You, 2019, Evaluation of metabolite-microbe correlation detection methods, Anal Biochem, 567, 106, 10.1016/j.ab.2018.12.008

Ammons, 2015, Biochemical association of metabolic profile and microbiome in chronic pressure ulcer wounds, PLoS One, 10, e0126735, 10.1371/journal.pone.0126735

Gilbert, 2016, Microbiome-wide association studies link dynamic microbial consortia to disease, Nature, 535, 94, 10.1038/nature18850

Wu, 2019, A selective review of multi-level omics data integration using variable selection, High Throughput, 8, 4, 10.3390/ht8010004

Kendall, 1938, A new measure of rank correlation, Biometrika, 30, 81, 10.2307/2332226

Kendall, 1948

Kendall, 1970

Zar, 2010

Stuart, 1953, The estimation and comparison of strengths of association in contingency tables, Biometrika, 40, 105, 10.2307/2333101

Somers, 1962, A similarity between Goodman and Kruskal's Tau and Kendall's Tau, with a partial interpretation of the latter, J Am Stat Assoc, 57, 804, 10.1080/01621459.1962.10500818

Goodman, 1959, Measures of association for cross classifications. II: further discussion and references, J Am Stat Assoc, 54, 123, 10.1080/01621459.1959.10501503

Zhang, 2017, A multivariate distance-based analytic framework for microbial interdependence association test in longitudinal study, Genet Epidemiol, 41, 769, 10.1002/gepi.22065

Wu, 2016, Cigarette smoking and the oral microbiome in a large study of American adults, ISME J, 10, 2435, 10.1038/ismej.2016.37

Fisher, 1958

Hutchinson, 1993, Kappa muddles together two sources of disagreement: tetrachoric correlation is preferable, Res Nurs Health, 16, 313, 10.1002/nur.4770160410

Boughorbel, 2017, Optimal classifier for imbalanced data using Matthews Correlation Coefficient metric, PLoS One, 12, 10.1371/journal.pone.0177678

Westcott, 2015, De novo clustering methods outperform reference-based methods for assigning 16S rRNA gene sequences to operational taxonomic units, PeerJ, 3, e1487, 10.7717/peerj.1487

Schloss, 2011, Assessing and improving methods used in operational taxonomic unit-based approaches for 16S rRNA gene sequence analysis, Appl Environ Microbiol, 77, 3219, 10.1128/AEM.02810-10

Pearson, 1900, X. On the criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can be reasonably supposed to have arisen from random sampling, Lond Edinb Dubl Phil Mag J Sci, 50, 157, 10.1080/14786440009463897

Plackett, 1983, Karl Pearson and the chi-squared test, Int Stat Rev, 51, 59, 10.2307/1402731

Cougoul, 2019, Rarity of microbial species: in search of reliable associations, PLoS One, 14, 10.1371/journal.pone.0200458

Cramér, 1946, Chapter 21. The two-dimensional case, 282

Guilford, 1936

Yule, 1912, On the methods of measuring association between two attributes, J R Stat Soc, 75, 579, 10.2307/2340126

Sheskin, 2011

La Rosa, 2012, Hypothesis testing and power calculations for taxonomic-based human microbiome data, PLoS One, 7, 10.1371/journal.pone.0052078

Goodman, 1979, 2

Edwards, 1963, The measure of association in a 2×2 table, J R Stat Soc Ser A, 126, 109, 10.2307/2982448

Morris, 1988, Calculating confidence intervals for relative risks (odds ratios) and standardised ratios and rates, Br Med J (Clin Res Ed), 296, 1313, 10.1136/bmj.296.6632.1313

Feinstein, 1973, Clinical biostatistics. XX. The epidemiologic trohoc, the ablative risk ratio, and ‘retrospective’ research, Clin Pharmacol Ther, 14, 291, 10.1002/cpt1973142291

Ahn, 2013, Human gut microbiome and risk for colorectal cancer, J Natl Cancer Inst, 105, 1907, 10.1093/jnci/djt300

Gill, 2006, Metagenomic analysis of the human distal gut microbiome, Science, 312, 1355, 10.1126/science.1124234

Schmitt, 2019, Gut microbiome patterns correlate with higher postoperative complication rates after pancreatic surgery, BMC Microbiol, 19, 42, 10.1186/s12866-019-1399-5

Yule, 1900, On the association of attributes in statistics: with illustrations from the material of the childhood society, &c, Philos Trans R Soc Lond Ser A, 194, 257, 10.1098/rsta.1900.0019

Egozcue, 2018, Linear association in compositional data analysis, Aust J Stat, 47, 3, 10.17713/ajs.v47i1.689

de Goffau, 2019, Human placenta has no microbiome but can contain potential pathogens, Nature, 572, 329, 10.1038/s41586-019-1451-5

Kim, 2019, Gut microbiota and risk of persistent nonalcoholic fatty liver diseases, J Clin Med, 8, 1089, 10.3390/jcm8081089

Meier, 2019, A Bayesian framework for identifying consistent patterns of microbial abundance between body sites, Stat Appl Genet Mol Biol, 18, 10.1515/sagmb-2019-0027

Jackson, 2018, Detection of stable community structures within gut microbiota co-occurrence networks from different human populations, PeerJ, 6, e4303, 10.7717/peerj.4303

Jackson, 2018, Gut microbiota associations with common diseases and prescription medications in a population-based cohort, Nat Commun, 9, 2655, 10.1038/s41467-018-05184-7

de Meij, 2016, Composition and stability of intestinal microbiota of healthy children within a Dutch population, FASEB J, 30, 1512, 10.1096/fj.15-278622

Drell, 2017, The influence of different maternal microbial communities on the development of infant gut and oral microbiota, Sci Rep, 7, 9940, 10.1038/s41598-017-09278-y

Jaccard, 1908, Nouvelles recherches sur la distribution florale, Bull Soc Vaud Sci Nat, 44, 223

van Rijsbergen, 1979

Xia, 2018, Community diversity measures and calculations, 167

Xia, 2018, Multivariate community analysis, 285

Boutin, 2015, Comparison of microbiomes from different niches of upper and lower airways in children and adolescents with cystic fibrosis, PLoS One, 10, e0116029, 10.1371/journal.pone.0116029

Mainali, 2017, Statistical analysis of co-occurrence patterns in microbial presence-absence datasets, PLoS One, 12, 10.1371/journal.pone.0187132

Wang, 2018, GePMI: a statistical model for personal intestinal microbiome identification, NPJ Biofilms Microbiomes, 4, 20, 10.1038/s41522-018-0065-2

Cover, 2006

Li, 2019, Optimal microbiome networks: macroecology and criticality, Entropy, 21, 506, 10.3390/e21050506

Martín, 2018, Enterotype-like microbiome stratification as emergent structure in complex adaptive systems: a mathematical model, bioRxiv

Menon, 2018, Interactions between species introduce spurious associations in microbiome studies, PLoS Comput Biol, 14, e1005939, 10.1371/journal.pcbi.1005939

Reshef, 2011, Detecting novel associations in large data sets, Science, 334, 1518, 10.1126/science.1205438

Cho, 2012, The human microbiome: at the interface of health and disease, Nat Rev Genet, 13, 260, 10.1038/nrg3182

Pinto, 2014, Spatial-temporal survey and occupancy-abundance modeling to predict bacterial community dynamics in the drinking water microbiome, mBio, 5, 10.1128/mBio.01135-14

Breiman, 1984

Malmuthuge, 2016, Gut microbiome and omics: a new definition to ruminant production and health, Anim Front, 6, 8, 10.2527/af.2016-0017

Janzon, 2019, Interactions between the gut microbiome and mucosal immunoglobulins A, M, and G in the developing infant gut, mSystems, 4, e00612, 10.1128/mSystems.00612-19

Kobayashi, 2018, Numerical analyses of intestinal microbiota by data mining, J Clin Biochem Nutr, 62, 124, 10.3164/jcbn.17-84

McCarthy, 2012, Differential expression analysis of multifactor RNA-Seq experiments with respect to biological variation, Nucleic Acids Res, 40, 4288, 10.1093/nar/gks042

Bullard, 2010, Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments, BMC Bioinf, 11, 94, 10.1186/1471-2105-11-94

McKnight, 2019, Methods for normalizing microbiome data: an ecological perspective, Methods Ecol Evol, 10, 389, 10.1111/2041-210X.13115

Chen, 2018, GMPR: a robust normalization method for zero-inflated count data with application to microbiome sequencing data, PeerJ, 6, e4600, 10.7717/peerj.4600

Morton, 2017, Balance trees reveal microbial niche differentiation, mSystems, 2, 10.1128/mSystems.00162-16

Vallejos, 2017, Normalizing single-cell RNA sequencing data: challenges and opportunities, Nat Methods, 14, 565, 10.1038/nmeth.4292

Giraldez, 2019, Phospho-RNA-seq: a modified small RNA-seq method that reveals circulating mRNA and lncRNA fragments as potential biomarkers in human plasma, EMBO J, 38, 10.15252/embj.2019101695

Lee, 2018, Enrichment of gut-derived Fusobacterium is associated with suboptimal immune recovery in HIV-infected individuals, Sci Rep, 8, 14277, 10.1038/s41598-018-32585-x

Biswas, 2016, Learning microbial interaction networks from metagenomic count data, J Comput Biol, 23, 526, 10.1089/cmb.2016.0061

Linden, 2008, Mucins in the mucosal barrier to infection, Mucosal Immunol, 1, 183, 10.1038/mi.2008.5

Fang, 2015, CCLasso: correlation inference for compositional data through Lasso, Bioinformatics, 31, 3172, 10.1093/bioinformatics/btv349

Yoon, 2019, Microbial networks in SPRING—semi-parametric rank-based correlation and partial correlation estimation for quantitative microbiome data, Front Genet, 10, 516, 10.3389/fgene.2019.00516

Schwager

Kostic, 2015, The dynamics of the human infant gut microbiome in development and in progression toward type 1 diabetes, Cell Host Microbe, 17, 260, 10.1016/j.chom.2015.01.001

Daquigan, 2017, High-resolution profiling of the gut microbiome reveals the extent of Clostridium difficile burden, NPJ Biofilms Microbiomes, 3, 35, 10.1038/s41522-017-0043-0

Esan, 2019, Exploring the long-term effect of plastic on compost microbiome, PLoS One, 14, 10.1371/journal.pone.0214376

Wirbel, 2019, Meta-analysis of fecal metagenomes reveals global microbial signatures that are specific for colorectal cancer, Nat Med, 25, 679, 10.1038/s41591-019-0406-6

Xiao, 2018, A phylogeny-regularized sparse regression model for predictive modeling of microbial community data, Front Microbiol, 9, 3112, 10.3389/fmicb.2018.03112

Meier, 2008, The group LASSO for logistic regression, J R Stat Soc B, 70, 53, 10.1111/j.1467-9868.2007.00627.x

Meier, 2018

Bickel, 2009, Simultaneous analysis of Lasso and Dantzig selector, Ann Stat, 37, 1705, 10.1214/08-AOS620

Muenchhoff, 2016, Nonprogressing HIV-infected children share fundamental immunological features of nonpathogenic SIV infection, Sci Transl Med, 8, 10.1126/scitranslmed.aag1048

Ravikumar, 2010, High-dimensional Ising model selection using ℓ1-regularized logistic regression, Ann Stat, 38, 1287, 10.1214/09-AOS691

van de Geer, 2014, On asymptotically optimal confidence regions and tests for high-dimensional models, Ann Stat, 42, 1166, 10.1214/14-AOS1221

Simon, 2013, A sparse-group Lasso, J Comput Graph Stat, 22, 231, 10.1080/10618600.2012.681250

Simon, 2018

Garcia, 2013, Identification of important regressor groups, subgroups and individuals via regularization methods: application to gut microbiome data, Bioinformatics, 30, 831, 10.1093/bioinformatics/btt608

Liquet, 2015, Group and sparse group partial least square approaches applied in genomics context, Bioinformatics, 32, 35, 10.1093/bioinformatics/btv535

Zhai, 2018, Variance component selection with applications to microbiome taxonomic data, Front Microbiol, 9, 509, 10.3389/fmicb.2018.00509

Friedman, 2007, Sparse inverse covariance estimation with the graphical lasso, Biostatistics, 9, 432, 10.1093/biostatistics/kxm045

Kurtz, 2015, Sparse and compositionally robust inference of microbial ecological networks, PLoS Comput Biol, 11, 10.1371/journal.pcbi.1004226

Lo, 2017, MPLasso: Inferring microbial association networks using prior microbial knowledge, PLoS Comput Biol, 13, e1005915, 10.1371/journal.pcbi.1005915

McGregor, 2020, MDiNE: a model to estimate differential co-occurrence networks in microbiome studies, Bioinformatics, 36, 1840, 10.1093/bioinformatics/btz824

Bálint, 2016, Millions of reads, thousands of taxa: microbial community structure and associations analyzed via marker genes, FEMS Microbiol Rev, 40, 686, 10.1093/femsre/fuw017

Knight, 2018, Best practices for analysing microbiomes, Nat Rev Microbiol, 16, 410, 10.1038/s41579-018-0029-9

Silverman, 2017, A phylogenetic transform enhances analysis of compositional microbiota data, eLife, 6, 10.7554/eLife.21887

Ban, 2015, Investigating microbial co-occurrence patterns based on metagenomic compositional data, Bioinformatics, 31, 3322, 10.1093/bioinformatics/btv364

Schwager, 2017, A Bayesian method for detecting pairwise associations in compositional data, PLoS Comput Biol, 13, e1005852, 10.1371/journal.pcbi.1005852

Dethlefsen, 2007, An ecological and evolutionary perspective on human-microbe mutualism and disease, Nature, 449, 811, 10.1038/nature06245

Cardona, 2016, Network-based metabolic analysis and microbial community modeling, Curr Opin Microbiol, 31, 124, 10.1016/j.mib.2016.03.008

Faust, 2015, Cross-biome comparison of microbial association networks, Front Microbiol, 6, 1200, 10.3389/fmicb.2015.01200

Dohlman, 2019, Mapping the microbial interactome: statistical and experimental approaches for microbiome network inference, Exp Biol Med (Maywood), 244, 445, 10.1177/1535370219836771

Abu-Ali, 2018, Metatranscriptome of human faecal microbial communities in a cohort of adult men, Nat Microbiol, 3, 356, 10.1038/s41564-017-0084-4

Chiquet, 2018

Gevers, 2014, The treatment-naive microbiome in new-onset Crohn's disease, Cell Host Microbe, 15, 382, 10.1016/j.chom.2014.02.005

Morton, 2019, Learning representations of microbe–metabolite interactions, Nat Methods, 16, 1306, 10.1038/s41592-019-0616-3

Mahana, 2016, Antibiotic perturbation of the murine gut microbiome enhances the adiposity, insulin resistance, and liver disease associated with high-fat diet, Genome Med, 8, 48, 10.1186/s13073-016-0297-9

Fuhrman, 2008, Community structure of marine bacterioplankton: patterns, networks, and relationships to function, Aquat Microb Ecol, 53, 69, 10.3354/ame01222

Agler, 2016, Microbial hub taxa link host and abiotic factors to plant microbiome variation, PLoS Biol, 14, 10.1371/journal.pbio.1002352

Steele, 2011, Marine bacterial, archaeal and protistan association networks reveal ecological linkages, ISME J, 5, 1414, 10.1038/ismej.2011.24

Fisher, 2014, Identifying keystone species in the human gut microbiome from metagenomic timeseries using sparse linear regression, PLoS One, 9, e102451, 10.1371/journal.pone.0102451

Patti, 2012, Metabolomics: the apogee of the omics trilogy, Nat Rev Mol Cell Biol, 13, 263, 10.1038/nrm3314

Chong, 2017, Computational approaches for integrative analysis of the metabolome and microbiome, Metabolites, 7, 62, 10.3390/metabo7040062

Human Microbiome Project Consortium, 2012, Structure, function and diversity of the healthy human microbiome, Nature, 486, 207, 10.1038/nature11234

Johnson, 2016, Metabolite and microbiome interplay in cancer immunotherapy, Cancer Res, 76, 6146, 10.1158/0008-5472.CAN-16-0309

Lee, 2018, Heterogeneity of microbiota dysbiosis in chronic rhinosinusitis: potential clinical implications and microbial community mechanisms contributing to sinonasal inflammation, Front Cell Infect Microbiol, 8, 168, 10.3389/fcimb.2018.00168

Levy, 2013, Metabolic modeling of species interaction in the human microbiome elucidates community-level assembly rules, Proc Natl Acad Sci USA, 110, 12804, 10.1073/pnas.1300926110

Kundu, 2019, Species-wide metabolic interaction network for understanding natural lignocellulose digestion in termite gut microbiota, Sci Rep, 9, 16329, 10.1038/s41598-019-52843-w

Levy, 2014, Metagenomic systems biology and metabolic modeling of the human microbiome: from species composition to community assembly rules, Gut Microbes, 5, 265, 10.4161/gmic.28261

Sung, 2017, Global metabolic interaction network of the human gut microbiota for context-specific community-scale analysis, Nat Commun, 8, 15393, 10.1038/ncomms15393

Mallick, 2019, Predictive metabolomic profiling of microbial communities using amplicon or metagenomic sequences, Nat Commun, 10, 3136, 10.1038/s41467-019-10927-1

Noecker, 2016, Metabolic model-based integration of microbiome taxonomic and metabolomic profiles elucidates mechanistic links between ecological and metabolic variation, mSystems, 1, e00013, 10.1128/mSystems.00013-15

Segata, 2013, Computational meta'omics for microbial community studies, Mol Syst Biol, 9, 666, 10.1038/msb.2013.22

Garza, 2018, Towards predicting the environmental metabolome from metagenomics with a mechanistic model, Nat Microbiol, 3, 456, 10.1038/s41564-018-0124-8

Larsen, 2015, Metabolome of human gut microbiome is predictive of host dysbiosis, GigaScience, 4, 42, 10.1186/s13742-015-0084-3

Mason, 2014, Metagenomics reveals sediment microbial community response to Deepwater Horizon oil spill, ISME J, 8, 1464, 10.1038/ismej.2013.254

Abubucker, 2012, Metabolic reconstruction for metagenomic data and its application to the human microbiome, PLoS Comput Biol, 8, 10.1371/journal.pcbi.1002358

Aagaard, 2014, The placenta harbors a unique microbiome, Sci Transl Med, 6, 237ra265, 10.1126/scitranslmed.3008599

Caspi, 2013, The MetaCyc database of metabolic pathways and enzymes and the BioCyc collection of pathway/genome databases, Nucleic Acids Res, 42, D459, 10.1093/nar/gkt1103

Nishida, 2014, KEGGscape: a Cytoscape app for pathway data integration, F1000Res, 3, 144, 10.12688/f1000research.4524.1

Vázquez-Baeza, 2018, Impacts of the human gut microbiome on therapeutics, Annu Rev Pharmacol Toxicol, 58, 253, 10.1146/annurev-pharmtox-042017-031849

Starr, 2018, Proteomic and metaproteomic approaches to understand host–microbe interactions, Anal Chem, 90, 86, 10.1021/acs.analchem.7b04340

Stinson, 2019, The not-so-sterile womb: evidence that the human fetus is exposed to bacteria prior to birth, Front Microbiol, 10, 1124, 10.3389/fmicb.2019.01124

Stull, 2018, Impact of edible cricket consumption on gut microbiota in healthy adults, a double-blind, randomized crossover trial, Sci Rep, 8, 10762, 10.1038/s41598-018-29032-2

Aßhauer, 2015, Tax4Fun: predicting functional profiles from metagenomic 16S rRNA data, Bioinformatics, 31, 2882, 10.1093/bioinformatics/btv287

Iwai, 2016, Piphillin: improved prediction of metagenomic content by direct inference from human microbiomes, PLoS One, 11, e0166104, 10.1371/journal.pone.0166104

Sampson, 2016, Gut microbiota regulate motor deficits and neuroinflammation in a model of Parkinson's disease, Cell, 167, 1469, 10.1016/j.cell.2016.11.018

Thompson, 2017, A communal catalogue reveals Earth's multiscale microbial diversity, Nature, 551, 457, 10.1038/nature24621

Markowitz, 2012, IMG: the Integrated Microbial Genomes database and comparative analysis system, Nucleic Acids Res, 40, D115, 10.1093/nar/gkr1044

Markowitz, 2014, IMG 4 version of the integrated microbial genomes comparative analysis system, Nucleic Acids Res, 42, D560, 10.1093/nar/gkt963

Bautista, 2016, Emerging investigators series: microbial communities in full-scale drinking water distribution systems—a meta-analysis, Environ Sci Water Res Technol, 2, 631, 10.1039/C6EW00030D

Bian, 2017, Gut microbiome response to sucralose and its potential role in inducing liver inflammation in mice, Front Physiol, 8, 487, 10.3389/fphys.2017.00487

Camarinha-Silva, 2017, Host genome influence on gut microbial composition and microbial prediction of complex traits in pigs, Genetics, 206, 1637, 10.1534/genetics.117.200782

Mukherjee, 2017, Bioinformatic approaches including predictive metagenomic profiling reveal characteristics of bacterial response to petroleum hydrocarbon contamination in diverse environments, Sci Rep, 7, 1108, 10.1038/s41598-017-01126-3

Schloss, 2009, Introducing mothur: open-source, platform-independent, community-supported software for describing and comparing microbial communities, Appl Environ Microbiol, 75, 7537, 10.1128/AEM.01541-09

Cole, 2014, Ribosomal Database Project: data and tools for high throughput rRNA analysis, Nucleic Acids Res, 42, D633, 10.1093/nar/gkt1244

Abia, 2018, Metagenomic analysis of the bacterial communities and their functional profiles in water and sediments of the Apies River, South Africa, as a function of land use, Sci Total Environ, 616–617, 326, 10.1016/j.scitotenv.2017.10.322

Bates, 2018, Amphibian chytridiomycosis outbreak dynamics are linked with host skin bacterial community structure, Nat Commun, 9, 693, 10.1038/s41467-018-02967-w

Franzosa, 2014, Relating the metatranscriptome and metagenome of the human gut, Proc Natl Acad Sci USA, 111, E2329, 10.1073/pnas.1319284111

Gosalbes, 2011, Metatranscriptomic approach to analyze the functional human gut microbiota, PLoS One, 6, 10.1371/journal.pone.0017447

Verberkmoes, 2009, Shotgun metaproteomics of the human distal gut microbiota, ISME J, 3, 179, 10.1038/ismej.2008.108

Perez-Cobas, 2013, Gut microbiota disturbance during antibiotic therapy: a multi-omic approach, Gut, 62, 1591, 10.1136/gutjnl-2012-303184

Chang, 2020, Chemical mechanisms of colonization resistance by the gut microbial metabolome, ACS Chem Biol, 10.1021/acschembio.9b00813

Tolosana-Delgado, 2019

Morton, 2019, Revisiting microbe-metabolite interactions: doing better than random, bioRxiv

Mann, 2014, Mucosa-associated bacterial microbiome of the gastrointestinal tract of weaned pigs and dynamics linked to dietary calcium-phosphorus, PLoS One, 9, e86950, 10.1371/journal.pone.0086950

Dutilh, 2014, A highly abundant bacteriophage discovered in the unknown sequences of human faecal metagenomes, Nat Commun, 5, 4498, 10.1038/ncomms5498

Ridaura, 2013, Gut microbiota from twins discordant for obesity modulate metabolism in mice, Science, 341, 1241214, 10.1126/science.1241214

Stein, 2013, Ecological modeling from time-series inference: insight into dynamics and stability of intestinal microbiota, PLoS Comput Biol, 9, 10.1371/journal.pcbi.1003388

Hotelling, 1935, Relations between two sets of variates, Biometrika, 28, 321, 10.1093/biomet/28.3-4.321

Jolliffe, 2002

Johnstone, 2009, On consistency and sparsity for principal components analysis in high dimensions, J Am Stat Assoc, 104, 682, 10.1198/jasa.2009.0121

Johnstone, 2009

Legendre, 2012

ter Braak, 2015, Topics in constrained and unconstrained ordination, Plant Ecol, 216, 683, 10.1007/s11258-014-0356-5

Parkhomenko, 2009, Sparse canonical correlation analysis with application to genomic data integration, Stat Appl Genet Mol Biol, 8, 1, 10.2202/1544-6115.1406

Fukuyama, 2019, Adaptive gPCA: a method for structured dimensionality reduction with applications to microbiome data, Ann Appl Stat, 13, 1043, 10.1214/18-AOAS1227

Jolliffe, 2003, A modified principal component technique based on the LASSO, J Comput Graph Stat, 12, 531, 10.1198/1061860032148

Silverman, 1996, Smoothed functional principal components analysis by choice of norm, Ann Stat, 24, 1, 10.1214/aos/1033066196

Sui, 2019, Mucosal vaccine efficacy against intrarectal SHIV is independent of anti-Env antibody response, J Clin Invest, 129, 1314, 10.1172/JCI122110

Hirschfeld, 1935, A connection between correlation and contingency, Math Proc Camb Philos Soc, 31, 520, 10.1017/S0305004100013517

Benzécri, 1973, L'Analyse des Données, vol. II

Gomez, 2015, Gut microbiome composition and metabolomic profiles of wild western lowland gorillas (Gorilla gorilla gorilla) reflect host ecology, Mol Ecol, 24, 2551, 10.1111/mec.13181

Jakobsson, 2010, Short-term antibiotic treatment has differing long-term impacts on the human throat and gut microbiome, PLoS One, 5, 10.1371/journal.pone.0009836

Nogueira, 2015, Microbiomes and potential metabolic pathways of pristine and anthropized Brazilian mangroves, Reg Stud Mar Sci, 2, 56, 10.1016/j.rsma.2015.08.008

Zhang, 2017, Human and rat gut microbiome composition is maintained following sleep restriction, Proc Natl Acad Sci USA, 114, E1564, 10.1073/pnas.1620673114

Jovel, 2016, Characterization of the gut microbiome using 16S or shotgun metagenomics, Front Microbiol, 7, 459, 10.3389/fmicb.2016.00459

Khine, 2019, Gut microbiome of pre-adolescent children of two ethnicities residing in three distant cities, Sci Rep, 9, 7831, 10.1038/s41598-019-44369-y

Ross, 2018, Comprehensive skin microbiome analysis reveals the uniqueness of human skin and evidence for phylosymbiosis within the class Mammalia, Proc Natl Acad Sci USA, 115, E5786, 10.1073/pnas.1801302115

Antharam, 2016, An integrated metabolomic and microbiome analysis identified specific gut microbiota associated with fecal cholesterol and coprostanol in Clostridium difficile infection, PLoS One, 11, e0148824, 10.1371/journal.pone.0148824

Lewis, 2017, The fecal microbial community of breast-fed infants from Armenia and Georgia, Sci Rep, 7, 40932, 10.1038/srep40932

Anderson, 2003, Canonical analysis of principal coordinates: a useful method of constrained ordination for ecology, Ecology, 84, 511, 10.1890/0012-9658(2003)084[0511:CAOPCA]2.0.CO;2

Ter Braak, 1988, A theory of gradient analysis, vol. 18, 271, 10.1016/S0065-2504(08)60183-X

Pérez-Jaramillo, 2017, Linking rhizosphere microbiome composition of wild and domesticated Phaseolus vulgaris to genotypic and root phenotypic traits, ISME J, 11, 2244, 10.1038/ismej.2017.85

Zhang, 2016, Ecological robustness of the gut microbiota in response to ingestion of transient food-borne microbes, ISME J, 10, 2235, 10.1038/ismej.2016.13

Bork, 2005, Towards cellular systems in 4D, Cell, 121, 507, 10.1016/j.cell.2005.05.001

Palsson, 2004, Two-dimensional annotation of genomes, Nat Biotechnol, 22, 1218, 10.1038/nbt1004-1218

Reed, 2006, Towards multidimensional genome annotation, Nat Rev Genet, 7, 130, 10.1038/nrg1769

Purdom, 2005, Error distribution for gene expression data, Stat Appl Genet Mol Biol, 4, 10.2202/1544-6115.1070

Zou, 2006, Sparse principal component analysis, J Comput Graph Stat, 15, 265, 10.1198/106186006X113430

Martino, 2019, A novel sparse compositional technique reveals microbial perturbations, mSystems, 4, e00016, 10.1128/mSystems.00016-19

Hyvärinen, 2000, Independent component analysis: algorithms and applications, Neural Netw, 13, 411, 10.1016/S0893-6080(00)00026-5

van Velzen, 2008, Multilevel data analysis of a crossover designed human nutritional intervention study, J Proteome Res, 7, 4483, 10.1021/pr800145j

Steinfath, 2008, Metabolite profile analysis: from raw data to regression and classification, Physiol Plant, 132, 150, 10.1111/j.1399-3054.2007.01006.x

Schölkopf, 1997, Kernel principal component analysis, 10.1007/BFb0020217

Schölkopf, 1998, Nonlinear component analysis as a kernel eigenvalue problem, Neural Comput, 10, 1299, 10.1162/089976698300017467

Loncar-Turukalo, 2019

Shiokawa, 2018, Application of kernel principal component analysis and computational machine learning to exploration of metabolites strongly associated with diet, Sci Rep, 8, 3426, 10.1038/s41598-018-20121-w

Landgraf, 2019, Generalized principal component analysis: projection of saturated model parameters, Technometrics, 1, 10.1080/00401706.2019.1668854

Vidal, 2004, A new GPCA algorithm for clustering subspaces by fitting, differentiating and dividing polynomials, vol. I, 510

Vidal, 2005, Generalized principal component analysis, IEEE Trans Pattern Anal Mach Intell, 27, 1945, 10.1109/TPAMI.2005.244

Hoerl, 2000, Ridge regression: biased estimation for nonorthogonal problems, Technometrics, 42, 80, 10.1080/00401706.2000.10485983

Fan, 2001, Variable selection via nonconcave penalized likelihood and its oracle properties, J Am Stat Assoc, 96, 1348, 10.1198/016214501753382273

Allen, 2011, Sparse non-negative generalized PCA with applications to metabolomics, Bioinformatics, 27, 3029, 10.1093/bioinformatics/btr522

Allen, 2014, A generalized least-square matrix decomposition, J Am Stat Assoc, 109, 145, 10.1080/01621459.2013.852978

Matsen, 2013, Edge principal components and squash clustering: using the special structure of phylogenetic placement data for sample comparison, PLoS One, 8, e56859, 10.1371/journal.pone.0056859

Savorani, 2013, A primer to nutritional metabolomics by NMR spectroscopy and chemometrics, Food Res Int, 54, 1131, 10.1016/j.foodres.2012.12.025

Purdom, 2011, Analysis of a data matrix and a graph: metagenomic data and the phylogenetic tree, Ann Appl Stat, 5, 2326, 10.1214/10-AOAS402

Bik, 2006, Molecular analysis of the bacterial microbiota in the human stomach, Proc Natl Acad Sci USA, 103, 732, 10.1073/pnas.0506655103

Zubin, 1938, A technique for measuring like-mindedness, J Abnorm Soc Psychol, 33, 508, 10.1037/h0055441

Tryon, 1939

Driver, 1932

Bailey, 1975, Cluster analysis, Sociol Methodol, 6, 59, 10.2307/270894

Bridges, 1966, Hierarchical cluster analysis, Psychol Rep, 18, 851, 10.2466/pr0.1966.18.3.851

Kaufman, 1987, Clustering by means of medoids, 405

Kaufman, 1990, Partitioning around medoids (Program PAM), 68

Banfield, 1993, Model-based Gaussian and non-Gaussian clustering, Biometrics, 49, 803, 10.2307/2532201

Ferreira, 2009, A comparison of hierarchical methods for clustering functional data, Commun Stat Simul Comput, 38, 1925, 10.1080/03610910903168603

McQuitty, 1960, Hierarchical linkage analysis for the isolation of types, Educ Psychol Meas, 20, 55, 10.1177/001316446002000106

Sokal, 1963

Blashfield, 1976, Mixture model tests of cluster analysis: accuracy of four agglomerative hierarchical methods, Psychol Bull, 83, 377, 10.1037/0033-2909.83.3.377

Hands, 1987, A Monte Carlo study of the recovery of cluster structure in binary data by hierarchical clustering techniques, Multivar Behav Res, 22, 235, 10.1207/s15327906mbr2202_6

Johnson, 2007

Kuiper, 1975, 391: a Monte Carlo comparison of six clustering procedures, Biometrics, 31, 777, 10.2307/2529565

Milligan, 1980, An examination of the effect of six types of error perturbation on fifteen clustering algorithms, Psychometrika, 45, 325, 10.1007/BF02293907

Shankar, 2015, The networks of human gut microbe-metabolite associations are different between health and irritable bowel syndrome, ISME J, 9, 1899, 10.1038/ismej.2014.258

Sridharan, 2014, Prediction and quantification of bioactive microbiota metabolites in the mouse gut, Nat Commun, 5, 5492, 10.1038/ncomms6492

Gajer, 2012, Temporal dynamics of the human vaginal microbiota, Sci Transl Med, 4, 10.1126/scitranslmed.3003605

Li, 2011, Variation of glucoraphanin metabolism in vivo and ex vivo by human gut bacteria, Br J Nutr, 106, 408, 10.1017/S0007114511000274

Romo-Vaquero, 2019, Deciphering the human gut microbiome of urolithin metabotypes: association with enterotypes and potential cardiometabolic health implications, Mol Nutr Food Res, 63, 1800958, 10.1002/mnfr.201800958

Veiga, 2010, Bifidobacterium animalis subsp. lactis fermented milk product reduces inflammation by altering a niche for colitogenic microbes, Proc Natl Acad Sci USA, 107, 18132, 10.1073/pnas.1011737107

Venkataraman, 2016, Variable responses of human microbiomes to dietary supplementation with resistant starch, Microbiome, 4, 33, 10.1186/s40168-016-0178-x

Rahbar, 2017

Taie, 2018, Clustering of human intestine microbiomes with K-means, 10.1109/NCG.2018.8593154

Kang, 2016, Healthy subjects differentially respond to dietary capsaicin correlating with specific gut enterotypes, J Clin Endocrinol Metab, 101, 4681, 10.1210/jc.2016-2786

Volokh, 2019, Human gut microbiome response induced by fermented dairy product intake in healthy volunteers, Nutrients, 11, 10.3390/nu11030547

Wu, 2011, Linking long-term dietary patterns with gut microbial enterotypes, Science, 334, 105, 10.1126/science.1208344

Hullar, 2015, Enterolignan-producing phenotypes are associated with increased gut microbial diversity and altered composition in premenopausal women in the United States, Cancer Epidemiol Biomark Prev, 24, 546, 10.1158/1055-9965.EPI-14-0262

Tsivtsivadze, 2013

Strehl, 2003

Imangaliyev, 2015, Personalized microbial network inference via co-regularized spectral clustering, Methods, 83, 28, 10.1016/j.ymeth.2015.03.017

Biesbroek, 2014, Early respiratory microbiota composition determines bacterial succession patterns and respiratory health in children, Am J Respir Crit Care Med, 190, 1283, 10.1164/rccm.201407-1240OC

Borgdorff, 2014, Lactobacillus-dominated cervicovaginal microbiota associated with reduced HIV/STI prevalence and genital HIV viral load in African women, ISME J, 8, 1781, 10.1038/ismej.2014.26

Borgdorff, 2016, Unique insights in the cervicovaginal Lactobacillus iners and L. crispatus proteomes and their associations with microbiota dysbiosis, PLoS One, 11, 10.1371/journal.pone.0150767

Kootte, 2017, Improvement of insulin sensitivity after lean donor feces in metabolic syndrome is driven by baseline intestinal microbiota composition, Cell Metab, 26, 611, 10.1016/j.cmet.2017.09.008

Botschuijver, 2018, Reversal of visceral hypersensitivity in rat by Menthacarin®, a proprietary combination of essential oils from peppermint and caraway, coincides with mycobiome modulation, Neurogastroenterol Motil, 30, 10.1111/nmo.13299

Sun, 2012, A large-scale benchmark study of existing algorithms for taxonomy-independent microbial community analysis, Brief Bioinform, 13, 107, 10.1093/bib/bbr009

Vinh, 2010

Ghodsi, 2011, DNACLUST: accurate and efficient clustering of phylogenetic marker genes, BMC Bioinf, 12, 271, 10.1186/1471-2105-12-271

Sun, 2009, ESPRIT: estimating species richness using large collections of 16S rRNA pyrosequences, Nucleic Acids Res, 37, e76, 10.1093/nar/gkp285

Cai, 2011, ESPRIT-Tree: hierarchical clustering analysis of millions of 16S rRNA pyrosequences in quasilinear computational time, Nucleic Acids Res, 39, e95, 10.1093/nar/gkr349

Flynn, 2015, Toward accurate molecular identification of species in complex environmental samples: testing the performance of sequence filtering and clustering methods, Ecol Evol, 5, 2252, 10.1002/ece3.1497

Mao, 2015, Parallel hierarchical clustering in linearithmic time for large-scale sequence analysis, 10.1109/ICDM.2015.90

Schmidt, 2015, Limits to robustness and reproducibility in the demarcation of operational taxonomic units, Environ Microbiol, 17, 1689, 10.1111/1462-2920.12610

Zheng, 2018, A parallel computational framework for ultra-large-scale sequence clustering analysis, Bioinformatics, 35, 380, 10.1093/bioinformatics/bty617

Wei, 2015, MtHc: a motif-based hierarchical method for clustering massive 16S rRNA sequences into OTUs, Mol Biosyst, 11, 1907, 10.1039/C5MB00089K

Wei, 2017, DBH: a de Bruijn graph-based heuristic method for clustering large-scale 16S rRNA sequences into OTUs, J Theor Biol, 425, 80, 10.1016/j.jtbi.2017.04.019

Wei, 2017, DMclust, a density-based modularity method for accurate OTU picking of 16S rRNA sequences, Mol Inf, 36, 1600059, 10.1002/minf.201600059

Cai, 2017, ESPRIT-Forest: parallel clustering of massive amplicon sequence data in subquadratic time, PLoS Comput Biol, 13, 10.1371/journal.pcbi.1005518

Wei, 2019, DMSC: a dynamic multi-seeds method for clustering 16S rRNA sequences into OTUs, Front Microbiol, 10, 428, 10.3389/fmicb.2019.00428

Claesson, 2017, A clinician's guide to microbiome analysis, Nat Rev Gastroenterol Hepatol, 14, 585, 10.1038/nrgastro.2017.97

Czaja, 2016, Factoring the intestinal microbiome into the pathogenesis of autoimmune hepatitis, World J Gastroenterol, 22, 9257, 10.3748/wjg.v22.i42.9257

Humphries, 2018, The gut microbiota and immune checkpoint inhibitors, Hum Vaccin Immunother, 14, 2178, 10.1080/21645515.2018.1442970

Asgari, 2018, MicroPheno: predicting environments and host phenotypes from 16S rRNA gene sequencing using a k-mer based representation of shallow sub-samples, Bioinformatics, 34, i32, 10.1093/bioinformatics/bty296

Zheng, 2018, SENSE: siamese neural network for sequence embedding and alignment-free comparison, Bioinformatics, 35, 1820, 10.1093/bioinformatics/bty887

Hao, 2011, Clustering 16S rRNA for OTU prediction: a method of unsupervised Bayesian clustering, Bioinformatics, 27, 611, 10.1093/bioinformatics/btq725

Feng, 2019, Accurate prediction of neoadjuvant chemotherapy pathological complete remission (pCR) for the four sub-types of breast cancer, IEEE Access, 7, 134697, 10.1109/ACCESS.2019.2941543

Mesuere, 2012, Unipept: tryptic peptide-based biodiversity analysis of metaproteome samples, J Proteome Res, 11, 5773, 10.1021/pr300576s

Muth, 2015, The MetaProteomeAnalyzer: a powerful open-source software suite for metaproteomics data analysis and interpretation, J Proteome Res, 14, 1557, 10.1021/pr501246w

Sinkko, 2011, Phosphorus chemistry and bacterial community composition interact in brackish sediments receiving agricultural discharges, PLoS One, 6, 10.1371/journal.pone.0021555

Ye, 2010, Multivariate analysis of chemical and microbial properties in histosols as influenced by land-use types, Soil Tillage Res, 110, 94, 10.1016/j.still.2010.06.013

Wang, 2012, Multivariate approach for studying interactions between environmental variables and microbial communities, PLoS One, 7, 10.1371/journal.pone.0050267

Rodriguez-Valera, 2004, Environmental genomics, the big picture?, FEMS Microbiol Lett, 231, 153, 10.1016/S0378-1097(04)00006-0

Zhang, 2018, Joint principal trend analysis for longitudinal high-dimensional data, Biometrics, 74, 430, 10.1111/biom.12751

Tofallis, 1999, Model building with multiple dependent variables and constraints, J R Stat Soc Ser D, 48, 371, 10.1111/1467-9884.00195

Gygi, 1999, Correlation between protein and mRNA abundance in yeast, Mol Cell Biol, 19, 1720, 10.1128/MCB.19.3.1720

Parkhomenko, 2007, Genome-wide sparse canonical correlation of gene expression with genotypes, BMC Proc, 1, S119, 10.1186/1753-6561-1-S1-S119

Suo, 2017

Waaijenborg, 2008, Quantifying the association between gene expressions and DNA-markers by penalized canonical correlation analysis, Stat Appl Genet Mol Biol, 7, 10.2202/1544-6115.1329

Witten, 2009, A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis, Biostatistics, 10, 515, 10.1093/biostatistics/kxp008

Witten, 2009, Extensions of sparse canonical correlation analysis with applications to genomic data, Stat Appl Genet Mol Biol, 8, 10.2202/1544-6115.1470

Solari, 2019

Witten, 2019

Abraham, 2014, Fast principal component analysis of large-scale genome-wide data, PLoS One, 9, 10.1371/journal.pone.0093766

Alam, 2008, Sensitivity analysis in robust and kernel canonical correlation analysis, 10.1109/ICCITECHN.2008.4802966

Blaschko, 2008

Van Gestel, 2001

Akaho, 2001, A kernel method for canonical correlation analysis, 4

Akaho, 2006

Lai, 2000, Kernel and nonlinear canonical correlation analysis, Int J Neural Syst, 10, 365, 10.1142/S012906570000034X

Melzer, 2001

Hardoon, 2004, Canonical correlation analysis: an overview with application to learning methods, Neural Comput, 16, 2639, 10.1162/0899766042321814

Larson, 2014, Kernel canonical correlation analysis for assessing gene-gene interactions and application to ovarian cancer, Eur J Hum Genet, 22, 126, 10.1038/ejhg.2013.69

Bie, 2003

Lai, 1999, A neural implementation of canonical correlation analysis, Neural Netw, 12, 1391, 10.1016/S0893-6080(99)00075-1

Andrew, 2013, Deep canonical correlation analysis, vol. 28, 1247

Ramsay, 2005

Ravikumar, 2009, Sparse additive models, J R Stat Soc Ser B Stat Methodol, 71, 1009, 10.1111/j.1467-9868.2009.00718.x

Balakrishnan, 2012, Sparse additive functional and kernel CCA, vol. 1

Dolédec, 1994, Co-inertia analysis: an alternative method for studying species–environment relationships, Freshw Biol, 31, 277, 10.1111/j.1365-2427.1994.tb01741.x

Thioulouse, 2011, Simultaneous analysis of a sequence of paired ecological tables: a comparison of several methods, Ann Appl Stat, 5, 2300, 10.1214/10-AOAS372

Dray, 2003, Co-Inertia analysis and the linking of ecological data tables, Ecology, 84, 3078, 10.1890/03-0178

Zhang, 2019, Statistical evaluation of diet-microbe associations, BMC Microbiol, 19, 90, 10.1186/s12866-019-1464-0

Bady, 2004, Multiple co-inertia analysis: a tool for assessing synchrony in the temporal variability of aquatic communities, C R Biol, 327, 29, 10.1016/j.crvi.2003.10.007

Hanafi, 2011, Connections between multiple co-inertia analysis and consensus principal component analysis, Chemom Intell Lab Syst, 106, 37, 10.1016/j.chemolab.2010.05.010

Claesson, 2012, Gut microbiota composition correlates with diet and health in the elderly, Nature, 488, 178, 10.1038/nature11319

Liu, 2017, Gut microbiome and serum metabolome alterations in obesity and after weight-loss intervention, Nat Med, 23, 859, 10.1038/nm.4358

Jovanović, 2014, The co-inertia approach in identification of specific microRNA in early and advanced atherosclerosis plaque, Med Hypotheses, 83, 11, 10.1016/j.mehy.2014.04.019

Raimondi, 2009, Bioconversion of soy isoflavones daidzin and daidzein by Bifidobacterium strains, Appl Microbiol Biotechnol, 81, 943, 10.1007/s00253-008-1719-4

Gao, 2019, In vitro digestion and fermentation of three polysaccharide fractions from Laminaria japonica and their impact on lipid metabolism-associated human gut microbiota, J Agric Food Chem, 67, 7496, 10.1021/acs.jafc.9b00970

Tap, 2015, Gut microbiota richness promotes its stability upon increased dietary fibre intake in healthy adults, Environ Microbiol, 17, 4954, 10.1111/1462-2920.13006

Min, 2018, Penalized co-inertia analysis with applications to -omics data, Bioinformatics, 35, 1018, 10.1093/bioinformatics/bty726

Hurley, 1962, The Procrustes Program: producing direct rotation to test a hypothesized factor structure, Behav Sci, 7, 258, 10.1002/bs.3830070216

Quinn, 2016, From sample to multi-omics conclusions in under 48 hours, mSystems, 1, 10.1128/mSystems.00038-16

Chen, 2017, Fiber-utilizing capacity varies in Prevotella- versus Bacteroides-dominated gut microbiota, Sci Rep, 7, 2594, 10.1038/s41598-017-02995-4

Shankar, 2013, Do gut microbial communities differ in pediatric IBS and health?, Gut Microbes, 4, 347, 10.4161/gmic.24827

Smits, 2016, Individualized responses of gut microbiota to dietary intervention modeled in humanized mice, mSystems, 1, 10.1128/mSystems.00098-16

Rajilic-Stojanovic, 2010, Evaluating the microbial diversity of an in vitro model of the human large intestine by phylogenetic microarray analysis, Microbiology, 156, 3270, 10.1099/mic.0.042044-0

Ringel-Kulka, 2013, Intestinal microbiota in healthy U.S. young children and adults—a high throughput microarray analysis, PLoS One, 8, 10.1371/journal.pone.0064315

Zhang, 2012, Structural resilience of the gut microbiota in adult mice under high-fat dietary perturbations, ISME J, 6, 1848, 10.1038/ismej.2012.27

Wilmes, 2004, The application of two-dimensional polyacrylamide gel electrophoresis and downstream analyses to a mixed community of prokaryotic microorganisms, Environ Microbiol, 6, 911, 10.1111/j.1462-2920.2004.00687.x

Ram, 2005, Community proteomics of a natural microbial biofilm, Science, 308, 1915, 10.1126/science.1109070

Akorli, 2016, Seasonality and locality affect the diversity of Anopheles gambiae and Anopheles coluzzii midgut microbiota from Ghana, PLoS One, 11, 10.1371/journal.pone.0157529

Dinleyici, 2018, Time series analysis of the microbiota of children suffering from acute infectious diarrhea and their recovery after treatment, Front Microbiol, 9, 1230, 10.3389/fmicb.2018.01230

Nie, 2017, Unraveling the correlation between microbiota succession and metabolite changes in traditional Shanxi aged vinegar, Sci Rep, 7, 9240, 10.1038/s41598-017-09850-6

Gower, 1989, Generalized canonical analysis, 221

Kettenring, 1971, Canonical analysis of several sets of variables, Biometrika, 58, 433, 10.1093/biomet/58.3.433

Carroll, 1968, Generalization of canonical correlation analysis to three or more sets of variables, 10.1037/e473742008-115

Jun, 2018, Multi-block analysis of genomic data using generalized canonical correlation analysis, Genome Inform, 16, e33, 10.5808/GI.2018.16.4.e33

Chessel, 1996, Analysis of the co-inertia of K tables (Analyses de la co-inertie de K nuages de points), Rev Stat Appl, 44, 35

Wold, 1987

Smilde, 2003, A framework for sequential multiblock component methods, J Chemometr, 17, 323, 10.1002/cem.811

Westerhuis, 1998, Analysis of multiblock and hierarchical PCA and PLS models, J Chemometr, 12, 301, 10.1002/(SICI)1099-128X(199809/10)12:5<301::AID-CEM515>3.0.CO;2-S

Rafii, 2015, The role of colonic bacteria in the metabolism of the natural isoflavone daidzin to equol, Metabolites, 5, 56, 10.3390/metabo5010056

Tenenhaus, 2014, Variable selection for generalized canonical correlation analysis, Biostatistics, 15, 569, 10.1093/biostatistics/kxu001

Setchell, 2013, Dietary factors influence production of the soy isoflavone metabolite s-(−)equol in healthy adults, J Nutr, 143, 1950, 10.3945/jn.113.179564

Liu, 2010, Prevalence of the equol-producer phenotype and its relationship with dietary isoflavone and serum lipids in healthy Chinese adults, J Epidemiol, 20, 377, 10.2188/jea.JE20090185

Xu, 1994, Daidzein is a more bioavailable soymilk isoflavone than is genistein in adult women, J Nutr, 124, 825, 10.1093/jn/124.6.825

Garali, 2017, A strategy for multimodal data integration: application to biomarkers identification in spinocerebellar ataxia, Brief Bioinform, 19, 1356, 10.1093/bib/bbx060

Wold, 1966, Estimation of principal components and related models by iterative least squares, 391

Abdi, 2010, Partial least squares regression and projection on latent structure regression (PLS Regression), WIREs Comput Stat, 2, 97, 10.1002/wics.51

Tobias, 1995, An introduction to partial least squares regression

Trygg, 2003, O2-PLS, a two-block (X–Y) latent variable regression (LVR) method with an integral OSC filter, J Chemometr, 17, 53, 10.1002/cem.775

Brereton, 2014, A short history of chemometrics: a personal view, J Chemometr, 28, 749, 10.1002/cem.2633

Dao, 2018, A data integration multi-omics approach to study calorie restriction-induced changes in insulin sensitivity, Front Physiol, 9, 1958, 10.3389/fphys.2018.01958

Chun, 2010, Sparse partial least squares regression for simultaneous dimension reduction and variable selection, J R Stat Soc Ser B Stat Methodol, 72, 3, 10.1111/j.1467-9868.2009.00723.x

Chung, 2010, Sparse partial least squares classification for high dimensional data, Stat Appl Genet Mol Biol, 9, 17, 10.2202/1544-6115.1492

Lê Cao, 2008, A sparse PLS for variable selection when integrating omics data, Stat Appl Genet Mol Biol, 7, 10.2202/1544-6115.1390

Trygg, 2002, O2-PLS for qualitative and quantitative analysis in multivariate calibration, J Chemometr, 16, 283, 10.1002/cem.724

Bylesjö, 2007, Data integration in plant biology: the O2PLS method for combined modeling of transcript and metabolite data, Plant J, 52, 1181, 10.1111/j.1365-313X.2007.03293.x

Cloarec, 2005, Statistical total correlation spectroscopy: an exploratory approach for latent biomarker identification from metabolic 1H NMR data sets, Anal Chem, 77, 1282, 10.1021/ac048630x

Cloarec, 2005, Evaluation of the orthogonal projection on latent structure model limitations caused by chemical shift variability and improved visualization of biomarker changes in 1H NMR spectroscopic metabonomic studies, Anal Chem, 77, 517, 10.1021/ac048803i

El Aidy, 2013, Gut bacteria–host metabolic interplay during conventionalisation of the mouse germfree colon, ISME J, 7, 743, 10.1038/ismej.2012.142

Bylesjö, 2008, K-OPLS package: kernel-based orthogonal projections to latent structures for prediction and interpretation in feature space, BMC Bioinf, 9, 106, 10.1186/1471-2105-9-106

Härdle, 2019, Discriminant analysis, 395

Izenman, 2008, Linear discriminant analysis, 237

Werner, 2011, Bacterial community structures are unique and resilient in full-scale bioenergy systems, Proc Natl Acad Sci USA, 108, 4158, 10.1073/pnas.1015676108

Kruskal, 1952, Use of ranks in one-criterion variance analysis, J Am Stat Assoc, 47, 583, 10.1080/01621459.1952.10483441

Blankenberg, 2010, Galaxy: a web-based genome analysis tool for experimentalists, Curr Protoc Mol Biol, 10.1002/0471142727.mb1910s89

Goecks, 2010, Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences, Genome Biol, 11, 10.1186/gb-2010-11-8-r86

Wolf, 2017, The salivary microbiome as an indicator of carcinogenesis in patients with oropharyngeal squamous cell carcinoma: a pilot study, Sci Rep, 7, 5867, 10.1038/s41598-017-06361-2

Puri, 2018, The circulating microbiome signature and inferred functional metagenomics in alcoholic hepatitis, Hepatology, 67, 1284, 10.1002/hep.29623

Thomas, 2019, Metagenomic analysis of colorectal cancer datasets identifies cross-cohort microbial diagnostic signatures and a link with choline degradation, Nat Med, 25, 667, 10.1038/s41591-019-0405-7

Chumpitazi, 2015, Randomised clinical trial: gut microbiome biomarkers are associated with clinical response to a low FODMAP diet in children with the irritable bowel syndrome, Aliment Pharmacol Ther, 42, 418, 10.1111/apt.13286

Barker, 2003, Partial least squares for discrimination, J Chemometr, 17, 166, 10.1002/cem.785

Christin, 2013, A critical assessment of feature selection methods for biomarker discovery in clinical proteomics, Mol Cell Proteomics, 12, 263, 10.1074/mcp.M112.022566

Nguyen, 2002, Tumor classification by partial least squares using microarray gene expression data, Bioinformatics, 18, 39, 10.1093/bioinformatics/18.1.39

Tan, 2004, Multi-class tumor classification by discriminant partial least squares using microarray gene expression data and assessment of classification models, Comput Biol Chem, 28, 235, 10.1016/j.compbiolchem.2004.05.002

Rohart, 2017, mixOmics: an R package for ‘omics feature selection and multiple data integration, PLoS Comput Biol, 13, e1005752, 10.1371/journal.pcbi.1005752

Worley, 2013, Utilities for quantifying separation in PCA/PLS-DA scores plots, Anal Biochem, 433, 102, 10.1016/j.ab.2012.10.011

Gomez-Alvarez, 2012, Metagenome analyses of corroded concrete wastewater pipe biofilms reveal a complex microbial system, BMC Microbiol, 12, 122, 10.1186/1471-2180-12-122

Worley, 2016, PCA as a practical indicator of OPLS-DA model reliability, Curr Metabolomics, 4, 97, 10.2174/2213235X04666160613122429

Stenlund, 2008, Orthogonal projections to latent structures discriminant analysis modeling on in situ FT-IR spectral imaging of liver tissue for identifying sources of variability, Anal Chem, 80, 6898, 10.1021/ac8005318

Bocca, 2018, A plasma metabolomic signature involving purine metabolism in human optic atrophy 1 (OPA1)-related disorders, Invest Ophthalmol Vis Sci, 59, 185, 10.1167/iovs.17-23027

Bennet, 2018, Multivariate modelling of faecal bacterial profiles of patients with IBS predicts responsiveness to a diet low in FODMAPs, Gut, 67, 872, 10.1136/gutjnl-2016-313128

Ramadan, 2014, Fecal microbiota of cats with naturally occurring chronic diarrhea assessed using 16S rRNA gene 454-pyrosequencing before and after dietary treatment, J Vet Intern Med, 28, 59, 10.1111/jvim.12261

Hastie, 2009

James, 2013

Loh, 2011, Classification and regression trees, WIREs Data Min Knowl Discovery, 1, 14, 10.1002/widm.8

Liaw, 2002, Classification and regression by randomForest, R News, 2, 18

Cutler, 2007, Random forests for classification in ecology, Ecology, 88, 2783, 10.1890/07-0539.1

Knights, 2011, Supervised classification of human microbiota, FEMS Microbiol Rev, 35, 343, 10.1111/j.1574-6976.2010.00251.x

Gashler, 2008, Decision tree ensemble: small heterogeneous is better than large homogeneous, 10.1109/ICMLA.2008.154

Griffin, 2017, Prior dietary practices and connections to a human gut microbial metacommunity alter responses to diet interventions, Cell Host Microbe, 21, 84, 10.1016/j.chom.2016.12.006

Lozupone, 2013, Alterations in the gut microbiota associated with HIV-1 infection, Cell Host Microbe, 14, 329, 10.1016/j.chom.2013.08.006

Piening, 2018, Integrative personal omics profiles during periods of weight gain and loss, Cell Syst, 6, 157, 10.1016/j.cels.2017.12.013

Beck, 2014, Machine learning techniques accurately classify microbial communities by bacterial vaginosis characteristics, PLoS One, 9, e87830, 10.1371/journal.pone.0087830

Chatterjee, 2020, Vitamin D receptor promotes healthy microbial metabolites and microbiome, Sci Rep, 10.1038/s41598-020-64226-7

Papa, 2012, Non-invasive mapping of the gastrointestinal microbiota identifies children with inflammatory bowel disease, PLoS One, 7, e39242, 10.1371/journal.pone.0039242

Cortes, 1995, Support-vector networks, Mach Learn, 20, 273, 10.1007/BF00994018

Gu, 2010, Discriminant analysis via support vectors, Neurocomputing, 73, 1669, 10.1016/j.neucom.2009.09.021

Xiao, 2018, Predictive modeling of microbiome data using a phylogeny-regularized generalized linear mixed model, Front Microbiol, 9, 1391, 10.3389/fmicb.2018.01391

Yang, 2006, An ecoinformatics tool for microbial community studies: supervised classification of Amplicon Length Heterogeneity (ALH) profiles of 16S rRNA, J Microbiol Methods, 65, 49, 10.1016/j.mimet.2005.06.012

Kolho, 2015, Fecal microbiota in pediatric inflammatory bowel disease and its relation to inflammation, Am J Gastroenterol, 110, 921, 10.1038/ajg.2015.149

Parks, 2013, Genetic control of obesity and gut microbiota composition in response to high-fat, high-sucrose diet in mice, Cell Metab, 17, 141, 10.1016/j.cmet.2012.12.007

Salonen, 2014, Impact of diet and individual variation on intestinal microbiota composition and fermentation products in obese men, ISME J, 8, 2218, 10.1038/ismej.2014.63

Furlotte, 2011, Mixed-model coexpression: calculating gene coexpression while accounting for expression heterogeneity, Bioinformatics, 27, i288, 10.1093/bioinformatics/btr221

Zhao, 2015, Integrative analysis of “-omics” data using penalty functions, Wiley Interdiscip Rev Comput Stat, 7, 99, 10.1002/wics.1322

Xia, 2018, Univariate community analysis, 251

Mankiewicz, 2004

Ruxton, 2006, The unequal variance t-test is an underused alternative to Student's t-test and the Mann–Whitney U test, Behav Ecol, 17, 688, 10.1093/beheco/ark016

Ciaccio, 2015, Home dust microbiota is disordered in homes of low-income asthmatic children, J Asthma, 52, 873, 10.3109/02770903.2015.1028076

Kononikhin, 2019, Proteome profiling of the exhaled breath condensate after long-term spaceflights, Int J Mol Sci, 20, 4518, 10.3390/ijms20184518

Kourosh, 2018, Fecal microbiome signatures are different in food-allergic children compared to siblings and healthy children, Pediatr Allergy Immunol, 29, 545, 10.1111/pai.12904

Kruskal, 1957, Historical notes on the Wilcoxon unpaired two-sample test, J Am Stat Assoc, 52, 356, 10.1080/01621459.1957.10501395

Falony, 2016, Population-level analysis of gut microbiome variation, Science, 352, 560, 10.1126/science.aad3503

Kovatcheva-Datchary, 2015, Dietary fiber-induced improvement in glucose metabolism is associated with increased abundance of Prevotella, Cell Metab, 22, 971, 10.1016/j.cmet.2015.10.001

Roager, 2014, Microbial enterotypes, inferred by the Prevotella-to-Bacteroides ratio, remained stable during a 6-month randomized controlled diet intervention with the New Nordic Diet, Appl Environ Microbiol, 80, 1142, 10.1128/AEM.03549-13

Suez, 2014, Artificial sweeteners induce glucose intolerance by altering the gut microbiota, Nature, 514, 181, 10.1038/nature13793

Zhao, 2018, Gut bacteria selectively promoted by dietary fibers alleviate type 2 diabetes, Science, 359, 1151, 10.1126/science.aao5774

Bouhnik, 2004, The capacity of nondigestible carbohydrates to stimulate fecal bifidobacteria in healthy humans: a double-blind, randomized, placebo-controlled, parallel-group, dose-response relation study, Am J Clin Nutr, 80, 1658, 10.1093/ajcn/80.6.1658

Santacruz, 2009, Interplay between weight loss and gut microbiota composition in overweight adolescents, Obesity, 17, 1906, 10.1038/oby.2009.112

Fisher, 1918, The correlation between relatives on the supposition of Mendelian inheritance, Earth Environ Sci Trans R Soc Edinb, 52, 399, 10.1017/S0080456800012163

Dao, 2016, Akkermansia muciniphila and improved metabolic health during a dietary intervention in obesity: relationship with gut microbiome richness and ecology, Gut, 65, 426, 10.1136/gutjnl-2014-308778

Mobini, 2017, Metabolic effects of Lactobacillus reuteri DSM 17938 in people with type 2 diabetes: a randomized controlled trial, Diabetes Obes Metab, 19, 579, 10.1111/dom.12861

Possemiers, 2007, Metabolism of isoflavones, lignans and prenylflavonoids by intestinal bacteria: producer phenotyping and relation with intestinal community, FEMS Microbiol Ecol, 61, 372, 10.1111/j.1574-6941.2007.00330.x

Liss, 2019, Microbiome diversity in carriers of fluoroquinolone resistant Escherichia coli, Investig Clin Urol, 60, 75, 10.4111/icu.2019.60.2.75

McArdle, 2001, Fitting multivariate models to community data: a comment on distance-based redundancy analysis, Ecology, 82, 290, 10.1890/0012-9658(2001)082[0290:FMMTCD]2.0.CO;2

Bhattacharya, 2014, Effect of bacteria on the wound healing behavior of oral epithelial cells, PLoS One, 9, e89475, 10.1371/journal.pone.0089475

Koh, 2018, An adaptive microbiome α-diversity-based association analysis method, Sci Rep, 8, 18026, 10.1038/s41598-018-36355-7

Radhakrishna Rao, 1948, Large sample tests of statistical hypotheses concerning several parameters with applications to problems of estimation, Math Proc Camb Philos Soc, 44, 50, 10.1017/S0305004100023987

Allen, 2009, A new phylogenetic diversity measure generalizing the Shannon index and its application to phyllostomid bats, Am Nat, 174, 236, 10.1086/600101

Rao, 1982, Diversity and dissimilarity coefficients: a unified approach, Theor Popul Biol, 21, 24, 10.1016/0040-5809(82)90004-1

Warwick, 1995, New ‘biodiversity’ measures reveal a decrease in taxonomic distinctness with increasing stress, Mar Ecol Prog Ser, 129, 301, 10.3354/meps129301

Koh, 2018, A highly adaptive microbiome-based association test for survival traits, BMC Genomics, 19, 210, 10.1186/s12864-018-4599-8

Pan, 2014, A powerful and adaptive association test for rare variants, Genetics, 197, 1081, 10.1534/genetics.114.165035

Koh, 2019, A distance-based kernel association test based on the generalized linear mixed model for correlated microbiome studies, Front Genet, 10, 458, 10.3389/fgene.2019.00458

Zhan, 2019, Relationship between MiRKAT and coefficient of determination in similarity matrix regression, Processes, 7, 79, 10.3390/pr7020079

Mantel, 1970, A technique of nonparametric multivariate analysis, Biometrics, 26, 547, 10.2307/2529108

Lisboa, 2014, Much beyond Mantel: bringing Procrustes association metric to the plant and soil ecologist's toolbox, PLoS One, 9, 10.1371/journal.pone.0101238

Zhou, 2017, Relationship between gingival crevicular fluid microbiota and cytokine profile in periodontal host homeostasis, Front Microbiol, 8, 2144, 10.3389/fmicb.2017.02144

Zhu, 2018, Antibiotics disturb the microbiome and increase the incidence of resistance genes in the gut of a common soil collembolan, Environ Sci Technol, 52, 3081, 10.1021/acs.est.7b04292

Kakumanu, 2016, Honey bee gut microbiome is altered by in-hive pesticide exposures, Front Microbiol, 7, 1255, 10.3389/fmicb.2016.01255

Li, 2019, Dysbiosis of lower respiratory tract microbiome are associated with inflammation and microbial function variety, Respir Res, 20, 272, 10.1186/s12931-019-1246-0

Marsilio, 2019, Characterization of the fecal microbiome in cats with inflammatory bowel disease or alimentary small cell lymphoma, Sci Rep, 9, 19208, 10.1038/s41598-019-55691-w

Mielke, 1984, Chapter 34. Meteorological applications of permutation techniques based on distance functions, vol. 4, 813, 10.1016/S0169-7161(84)04036-0

Warton, 2012, Distance-based multivariate analyses confound location and dispersion effects, Methods Ecol Evol, 3, 89, 10.1111/j.2041-210X.2011.00127.x

Mielke, 2007

McCune, 2002

Falk, 2013, Partial bioaugmentation to remove 3-chloroaniline slows bacterial species turnover rate in bioreactors, Water Res, 47, 7109, 10.1016/j.watres.2013.08.040

Morissette, 2018, Growth performance of piglets during the first two weeks of lactation affects the development of the intestinal microbiota, J Anim Physiol Anim Nutr (Berl), 102, 525, 10.1111/jpn.12784

Reese, 2018, Drivers of microbiome biodiversity: a review of general rules, feces, and ignorance, mBio, 9, 10.1128/mBio.01294-18

Bacon-Shone, 2008, Discrete and continuous compositions

Anders, 2013, Count-based differential expression analysis of RNA sequencing data using R and bioconductor, Nat Protoc, 8, 1765, 10.1038/nprot.2013.099

Kuczynski, 2012, Experimental and analytical tools for studying the human microbiome, Nat Rev Genet, 13, 47, 10.1038/nrg3129

Xu, 2015, Assessment and selection of competing models for zero-inflated microbiome data, PLoS One, 10, 10.1371/journal.pone.0129606

Feng, 2015, Some theoretical comparisons of negative binomial and zero-inflated poisson distributions, Commun Stat Theory Methods, 44, 3266, 10.1080/03610926.2013.823203

Mosimann, 1962, On the compound multinomial distribution, the multivariate β-distribution, and correlations among proportions, Biometrika, 49, 65, 10.2307/2333468

Mosimann, 1963, On the compound negative multinomial distribution and correlations among inversely sampled pollen counts, Biometrika, 50, 47, 10.2307/2333745

Holmes, 2012, Dirichlet multinomial mixtures: generative models for microbial metagenomics, PLoS One, 7, 10.1371/journal.pone.0030126

Chen, 2013, Variable selection for sparse Dirichlet-multinomial regression with an application to microbiome data analysis, Ann Appl Stat, 7, 418, 10.1214/12-AOAS592

Wang, 2017, Constructing predictive microbial signatures at multiple taxonomic levels, J Am Stat Assoc, 112, 1022, 10.1080/01621459.2016.1270213

Wang, 2017, A Dirichlet-tree multinomial regression model for associating dietary nutrients with gut microorganisms, Biometrics, 73, 792, 10.1111/biom.12654

Sankaran, 2018, Latent variable modeling for the microbiome, Biostatistics, 20, 599, 10.1093/biostatistics/kxy018

Shi, 2017, A model for paired-multinomial data and its application to analysis of data on a taxonomic tree, Biometrics, 73, 1266, 10.1111/biom.12681

Tang, 2018, Zero-inflated generalized Dirichlet multinomial regression model for microbiome compositional data analysis, Biostatistics, 20, 698, 10.1093/biostatistics/kxy025

Xia, 2013, A logistic normal multinomial regression model for microbiome compositional data analysis, Biometrics, 69, 1053, 10.1111/biom.12079

Nowicka, 2016, DRIMSeq: a Dirichlet-multinomial framework for multivariate count outcomes in genomics, F1000Res, 5, 1356, 10.12688/f1000research.8900.2

Harrison, 2020, Dirichlet-multinomial modelling outperforms alternatives for analysis of microbiome and other ecological count data, Mol Ecol Resour, 20, 481, 10.1111/1755-0998.13128

Wang, 2020, Estimating and testing the microbial causal mediation effect with high-dimensional and compositional microbiome data, Bioinformatics, 36, 347, 10.1093/bioinformatics/btz565

Bouguila, 2011, Count data modeling and classification using finite mixtures of distributions, IEEE Trans Neural Netw, 22, 186, 10.1109/TNN.2010.2091428

Ye, 2010, Compositional adjustment of Dirichlet mixture priors, J Comput Biol, 17, 1607, 10.1089/cmb.2010.0117

Song, 2019, An adaptive independence test for microbiome community data, Biometrics, 10.1111/biom.13154

Chu, 2017, Maturation of the infant microbiome community structure and function across multiple body sites and in relation to mode of delivery, Nat Med, 23, 314, 10.1038/nm.4272

Vandeputte, 2016, Stool consistency is strongly associated with gut microbiota richness and composition, enterotypes and bacterial growth rates, Gut, 65, 57, 10.1136/gutjnl-2015-309618

Lin, 2014, Variable selection in regression with compositional covariates, Biometrika, 101, 785, 10.1093/biomet/asu031

Tang, 2018, A phylogenetic scan test on a Dirichlet-tree multinomial model for microbiome data, Ann Appl Stat, 12, 1, 10.1214/17-AOAS1086

Dennis, 1991, On the hyper-Dirichlet type 1 and hyper-Liouville distributions, Commun Stat Theory Methods, 20, 4069, 10.1080/03610929108830757

Bradley, 2018, Phylogeny-corrected identification of microbial gene families relevant to human gut colonization, PLoS Comput Biol, 14, 10.1371/journal.pcbi.1006242

Connor, 1969, Concepts of independence for proportions with a generalization of the Dirichlet distribution, J Am Stat Assoc, 64, 194, 10.1080/01621459.1969.10500963

Tang, 2019, Multi-omic analysis of the microbiome and metabolome in healthy subjects reveals microbiome-dependent relationships between diet and metabolites, Front Genet, 10, 454, 10.3389/fgene.2019.00454

Tang, 2017

Mao, 2019, Bayesian graphical compositional regression for microbiome data, J Am Stat Assoc, 1, 10.1080/01621459.2019.1647212

Yang, 2017, Inference of environmental factor-microbe and microbe-microbe associations from metagenomic data using a hierarchical Bayesian statistical model, Cell Syst, 4, 129, 10.1016/j.cels.2016.12.012

Tackmann, 2019, Rapid inference of direct interactions in large-scale ecological networks from heterogeneous microbial sequencing data, Cell Syst, 9, 286, 10.1016/j.cels.2019.08.002

Yuan, 2019, Compositional data network analysis via lasso penalized D-trace loss, Bioinformatics, 35, 3404, 10.1093/bioinformatics/btz098

Liu, 2018, Comprehensive simulation of metagenomic sequencing data with non-uniform sampling distribution, Quant Biol, 6, 175, 10.1007/s40484-018-0142-9

Wong, 2019, Gut microbiota in colorectal cancer: mechanisms of action and clinical applications, Nat Rev Gastroenterol Hepatol, 16, 690, 10.1038/s41575-019-0209-8

Larson, 2019, A review of kernel methods for genetic association studies, Genet Epidemiol, 43, 122, 10.1002/gepi.22180

Lee, 2012, Optimal unified approach for rare-variant association testing with application to small-sample case-control whole-exome sequencing studies, Am J Hum Genet, 91, 224, 10.1016/j.ajhg.2012.06.007

Li, 2012, Gene-centric gene–gene interaction: a model-based kernel machine method, Ann Appl Stat, 6, 1134, 10.1214/12-AOAS545

Lin, 2013, Test for interactions between a genetic marker set and environment in generalized linear models, Biostatistics, 14, 667, 10.1093/biostatistics/kxt006

Schaid, 2013, Multiple genetic variant association testing by collapsing and kernel methods with pedigree or population structured data, Genet Epidemiol, 37, 409, 10.1002/gepi.21727

Choi, 2014, FARVAT: a family-based rare variant association test, Bioinformatics, 30, 3197, 10.1093/bioinformatics/btu496

Saad, 2014, Combining family- and population-based imputation data for association analysis of rare and common variants in large pedigrees, Genet Epidemiol, 38, 579, 10.1002/gepi.21844

Wang, 2016, Boosting the power of the sequence kernel association test by properly estimating its null distribution, Am J Hum Genet, 99, 104, 10.1016/j.ajhg.2016.05.011

Wu, 2016, Sequence kernel association test of multiple continuous phenotypes, Genet Epidemiol, 40, 91, 10.1002/gepi.21945

Schweiger, 2017, RL-SKAT: an exact and efficient score test for heritability and set tests, Genetics, 207, 1275, 10.1534/genetics.117.300395

Chen, 2016, Small sample kernel association tests for human genetic and microbiome association studies, Genet Epidemiol, 40, 5, 10.1002/gepi.21934

Zhan, 2017, A small-sample multivariate kernel machine test for microbiome association studies, Genet Epidemiol, 41, 210, 10.1002/gepi.22030

Zhan, 2018, A small-sample kernel association test for correlated data with application to microbiome association studies, Genet Epidemiol, 42, 772, 10.1002/gepi.22160

Lumley, 2018, FastSKAT: sequence kernel association tests for very large sets of markers, Genet Epidemiol, 42, 516, 10.1002/gepi.22136

Yan, 2018, KMgene: a unified R package for gene-based association analysis for complex traits, Bioinformatics, 34, 2144, 10.1093/bioinformatics/bty066

Plantinga, 2017, MiRKAT-S: a community-level test of association between the microbiota and survival times, Microbiome, 5, 17, 10.1186/s40168-017-0239-9

Tang, 2016, PERMANOVA-S: association test for microbial community composition that accommodates confounders and multiple distances, Bioinformatics, 32, 2618, 10.1093/bioinformatics/btw311

Benjamini, 2010, Discovering the false discovery rate, J R Stat Soc Series B Stat Methodol, 72, 405, 10.1111/j.1467-9868.2010.00746.x

Parks, 2014, STAMP: statistical analysis of taxonomic and functional profiles, Bioinformatics, 30, 3123, 10.1093/bioinformatics/btu494

Sun, 2017

Hu, 2018, A two-stage microbial association mapping framework with advanced FDR control, Microbiome, 6, 131, 10.1186/s40168-018-0517-1

Yekutieli, 2008, Hierarchical false discovery rate–controlling methodology, J Am Stat Assoc, 103, 309, 10.1198/016214507000001373

Yekutieli, 2006, Approaches to multiplicity issues in complex research in microarray analysis, Stat Neerl, 60, 414, 10.1111/j.1467-9574.2006.00343.x

Benjamini, 2005, Quantitative trait loci analysis using the false discovery rate, Genetics, 171, 783, 10.1534/genetics.104.036699

Zehetmayer, 2005, Two-stage designs for experiments with a large number of hypotheses, Bioinformatics, 21, 3771, 10.1093/bioinformatics/bti604

Reiner-Benaim, 2007, Associating quantitative behavioral traits with gene expression in the brain: searching for diamonds in the hay, Bioinformatics, 23, 2239, 10.1093/bioinformatics/btm300

Aitchison, 1982, The statistical analysis of compositional data (with discussion), J R Stat Soc Series B Stat Methodol, 44, 139

Billheimer, 2001, Statistical interpretation of species composition, J Am Stat Assoc, 96, 1205, 10.1198/016214501753381850

Grantham, 2019, MIMIX: a Bayesian mixed-effects model for microbiome data from designed experiments, J Am Stat Assoc, 1, 10.1080/01621459.2019.1626242

Li, 2019

Xia, 2018

Principal Coordinates Analysis, Encyclopedia of Biostatistics

Peng, 2016, Zero-inflated beta regression for differential abundance analysis with metagenomics data, J Comput Biol, 23, 102, 10.1089/cmb.2015.0157

Chen, 2016, A two-part mixed-effects model for analyzing longitudinal microbiome compositional data, Bioinformatics, 32, 2611, 10.1093/bioinformatics/btw308

Chai, 2018, A marginalized two-part Beta regression model for microbiome compositional data, PLoS Comput Biol, 14, 10.1371/journal.pcbi.1006329

Bourke, 2019, Cotrimoxazole reduces systemic inflammation in HIV infection by altering the gut microbiome and immune activation, Sci Transl Med, 11, 10.1126/scitranslmed.aav0537

Nolan-Kenney, 2019, The association between smoking and gut microbiome in Bangladesh, Nicotine Tob Res, 10.1093/ntr/ntz220

Zhang, 2010, Nearly unbiased variable selection under minimax concave penalty, Ann Stat, 38, 894, 10.1214/09-AOS729

Randolph, 2018, Kernel-penalized regression for analysis of microbiome data, Ann Appl Stat, 12, 540, 10.1214/17-AOAS1102

Coker, 2020, Specific class of intrapartum antibiotics relates to maturation of the infant gut microbiota: a prospective cohort study, BJOG, 127, 217, 10.1111/1471-0528.15799

Hoen, 2018, Sex-specific associations of infants’ gut microbiome with arsenic exposure in a US population, Sci Rep, 8, 12627, 10.1038/s41598-018-30581-9

Banerjee, 2019, An adaptive multivariate two-sample test with application to microbiome differential abundance analysis, Front Genet, 10, 350, 10.3389/fgene.2019.00350

Sohn, 2015, A robust approach for identifying differentially abundant features in metagenomic samples, Bioinformatics, 31, 2269, 10.1093/bioinformatics/btv165

Cao, 2017, Two-sample tests of high-dimensional means for compositional data, Biometrika, 105, 115, 10.1093/biomet/asx060

Mishra, 2019

Aitchison, 1984, Log contrast models for experiments with mixtures, Biometrika, 71, 323, 10.1093/biomet/71.2.323

Combettes, 2019

Martins, 1997, Phylogenies and the comparative method: a general approach to incorporating phylogenetic information into the analysis of interspecific data, Am Nat, 149, 646, 10.1086/286013

Liu, 2019

Tanaseichuk, 2013, Phylogeny-based classification of microbial communities, Bioinformatics, 30, 449, 10.1093/bioinformatics/btt700

Peters, 2019, The microbiome in lung cancer tissue and recurrence-free survival, Cancer Epidemiol Biomark Prev, 28, 731, 10.1158/1055-9965.EPI-18-0966

Diggle, 2002

Fitzmaurice, 2004

Fabregat-Traver, 2014, High-performance mixed models based genome-wide association analysis with omicABEL software, F1000Res, 3, 200, 10.12688/f1000research.4867.1

Zhao, 2019, Data analysis of MS-based clinical lipidomics studies with crossover design: a tutorial mini-review of statistical methods, Clin Mass Spectrom, 13, 5, 10.1016/j.clinms.2019.05.002

Cho, 2012, Antibiotics in early life alter the murine colonic microbiome and adiposity, Nature, 488, 621, 10.1038/nature11400

Cox, 2014, Altering the intestinal microbiota during a critical developmental window has lasting metabolic consequences, Cell, 158, 705, 10.1016/j.cell.2014.05.052

Ruan, 2006, Local similarity analysis reveals unique associations among marine bacterioplankton species and environmental factors, Bioinformatics, 22, 2532, 10.1093/bioinformatics/btl417

Xia, 2011, Extended local similarity analysis (eLSA) of microbial community and other time series data with replicates, BMC Syst Biol, 5, S15, 10.1186/1752-0509-5-S2-S15

Xia, 2013, Efficient statistical significance approximation for local similarity analysis of high-throughput time series data, Bioinformatics, 29, 230, 10.1093/bioinformatics/bts668

Bucci, 2016, MDSINE: Microbial Dynamical Systems INference Engine for microbiome time-series analyses, Genome Biol, 17, 121, 10.1186/s13059-016-0980-6

Baksi, 2018, 'TIME': a web application for obtaining insights into microbial ecology using longitudinal microbiome data, Front Microbiol, 9, 36, 10.3389/fmicb.2018.00036

Shields-Cutler, 2018, SplinectomeR enables group comparisons in longitudinal microbiome studies, Front Microbiol, 9, 785, 10.3389/fmicb.2018.00785

Zhang, 2013, Principal trend analysis for time-course data with applications in genomic medicine, Ann Appl Stat, 7, 2205, 10.1214/13-AOAS659

Holter, 2001, Dynamic modeling of gene expression data, Proc Natl Acad Sci USA, 98, 1693, 10.1073/pnas.98.4.1693

Kimeldorf, 1970, A correspondence between Bayesian estimation on stochastic processes and smoothing by splines, Ann Math Stat, 41, 495, 10.1214/aoms/1177697089

Ilan, 2019, Why targeting the microbiome is not so successful: can randomness overcome the adaptation that occurs following gut manipulation?, Clin Exp Gastroenterol, 12, 209, 10.2147/CEG.S203823

Fu, 2015, The gut microbiome contributes to a substantial proportion of the variation in blood lipids, Circ Res, 117, 817, 10.1161/CIRCRESAHA.115.306807

Liu, 2016, A zero-inflated Poisson model for insertion tolerance analysis of genes based on Tn-seq data, Bioinformatics, 32, 1701, 10.1093/bioinformatics/btw061

Zhang, 2018, Negative binomial mixed models for analyzing longitudinal microbiome data, Front Microbiol, 9, 1683, 10.3389/fmicb.2018.01683

Lee, 2018, A Bayesian semiparametric regression model for joint analysis of microbiome data, Front Microbiol, 9, 522, 10.3389/fmicb.2018.00522

Gregory, 2016, Influence of maternal breast milk ingestion on acquisition of the intestinal microbiome in preterm infants, Microbiome, 4, 68, 10.1186/s40168-016-0214-x

Fang, 2016, Zero-inflated negative binomial mixed model: an application to two microbial organisms important in oesophagitis, Epidemiol Infect, 144, 2447, 10.1017/S0950268816000662

Zhang, 2016, Zero-inflated negative binomial regression for differential abundance testing in microbiome studies, J Bioinf Genomics, 2, 2

Chen, 2017, An omnibus test for differential distribution analysis of microbiome sequencing data, Bioinformatics, 34, 643, 10.1093/bioinformatics/btx650

D’Agata, 2019, Effects of early life NICU stress on the developing gut microbiome, Dev Psychobiol, 61, 650, 10.1002/dev.21826

Gorshein, 2017, Lactobacillus rhamnosus GG probiotic enteric regimen does not appreciably alter the gut microbiome or provide protection against GVHD after allogeneic hematopoietic stem cell transplantation, Clin Transplant, 31, 10.1111/ctr.12947

Sitarik, 2018, Dog introduction alters the home dust microbiota, Indoor Air, 28, 539, 10.1111/ina.12456

Zhai, 2019, Exact variance component tests for longitudinal microbiome studies, Genet Epidemiol, 43, 250, 10.1002/gepi.22185

Brooks, 2017, glmmTMB balances speed and flexibility among packages for zero-inflated generalized linear mixed modeling, R Journal, 9, 378, 10.32614/RJ-2017-066

Bokulich, 2018, q2-longitudinal: longitudinal and paired-sample analyses of microbiome data, mSystems, 3, 10.1128/mSystems.00219-18

Guijarro, 2018, Soil microbial communities and glyphosate decay in soils with different herbicide application history, Sci Total Environ, 634, 974, 10.1016/j.scitotenv.2018.03.393

Mahnert, 2018, Enriching beneficial microbial diversity of indoor plants and their surrounding built environment with biostimulants, Front Microbiol, 9, 2985, 10.3389/fmicb.2018.02985

Cristianini, 2000

Lin, 1997, Variance component testing in generalised linear models with random effects, Biometrika, 84, 309, 10.1093/biomet/84.2.309

Plantinga, 2019, pldist: ecological dissimilarities for paired and longitudinal microbiome association analysis, Bioinformatics, 35, 3567, 10.1093/bioinformatics/btz120

Gower, 1971, A general coefficient of similarity and some of its properties, Biometrics, 27, 857, 10.2307/2528823

Williams, 2019, microbiomeDASim: simulating longitudinal differential abundance for microbiome data [version 1; peer review: 1 approved, 1 approved with reservations], F1000Res, 8, 1769, 10.12688/f1000research.20660.1

Foster, 1999, Actinobacillus seminis as a cause of abortion in a UK sheep flock, Vet Rec, 144, 479, 10.1136/vr.144.17.479

Osaka, 2017, Meta-analysis of fecal microbiota and metabolites in experimental colitic mice during the inflammatory and healing phases, Nutrients, 9, E1329, 10.3390/nu9121329

Smith, 2018, Reproduction in domestic ruminants during the past 50 yr: discovery to application, J Anim Sci, 96, 2952, 10.1093/jas/sky139

Raes, 2008, Molecular eco-systems biology: towards an understanding of community function, Nat Rev Microbiol, 6, 693, 10.1038/nrmicro1935