Compositional data analysis of the microbiome: fundamentals, tools, and challenges

Annals of Epidemiology - Tập 26 - Trang 330-335 - 2016
Matthew C.B. Tsilimigras1, Anthony A. Fodor1
1Department of Bioinformatics and Genomics, UNC Charlotte, Bioinformatics Building, The University of North Carolina, Charlotte 9201, University City Blvd, Charlotte

Tài liệu tham khảo

Bacon-Shone, 2011, A short history of compositional data analysis, 3 Pearson, 1897, Mathematical contributions to the Theory of Evolution—on a form of spurious correlation which may arise when indices are used in the measurement of organs, Proc R Soc Lond, 60, 489, 10.1098/rspl.1896.0076 Aitchison, 1986 Campbell, 2009, Compositional data analysis for elemental data in forensic science, Forensic Sci Int, 188, 81, 10.1016/j.forsciint.2009.03.018 Neocleous, 2011, Transformations for compositional data with zeros with an application to forensic evidence evaluation, Chemometer Intell Lab, 109, 77, 10.1016/j.chemolab.2011.08.003 Pennington, 2009, Analysis of compositional data in communication disorders research, J Commun Disord, 42, 18, 10.1016/j.jcomdis.2008.06.002 Faes, 2011, Analysing the composition of outpatient antibiotic use: a tutorial on compositional data analysis, J Antimicrob Chemother, 66, vi89, 10.1093/jac/dkr461 Leite, 2014, Applying compositional data methodology to nutritional epidemiology, Stat Methods Med Res Caporaso, 2010, QIIME allows analysis of high-throughput community sequencing data, Nat Methods, 7, 335, 10.1038/nmeth.f.303 Kuczynski, 2012, Using QIIME to Analyze 16S rRNA Gene Sequences from Microbial Communities, Curr Protoc Microbiol, Chapter 1, 1 Faust, 2012, Microbial co-occurrence relationships in the Human Microbiome, PLoS Comput Biol, 8, e1002606, 10.1371/journal.pcbi.1002606 Jackson, 1997, Compositional data in community ecology: the paradigm or peril of proportions?, Ecology, 78, 929, 10.1890/0012-9658(1997)078[0929:CDICET]2.0.CO;2 Li, 2015, Microbiome, Metagenomics and High-Dimensional Compositional Data Analysis, Annu Rev Stat Its Appl, 2, 73, 10.1146/annurev-statistics-010814-020351 Kurtz, 2015, Sparse and compositionally robust inference of microbial ecological networks, PLoS Comput Biol, 11, e1004226, 10.1371/journal.pcbi.1004226 Finucane, 2014, A taxonomic signature of obesity in the microbiome? Getting to the guts of the matter, PLoS One, 9, e84689, 10.1371/journal.pone.0084689 Fernandes, 2014, Unifying the analysis of high-throughput sequencing datasets: characterizing RNA-seq, 16S rRNA gene sequencing and selective growth experiments by compositional data analysis, Microbiome, 2, 15, 10.1186/2049-2618-2-15 Egozcue, 2003, Isometric logratio transformations for compositional data analysis, Math Geol, 35, 279, 10.1023/A:1023818214614 McMurdie, 2014, Waste not, want not: why rarefying microbiome data is inadmissible, PLoS Comput Biol, 10, e1003531, 10.1371/journal.pcbi.1003531 Weiss, 2015, Effects of library size variance, sparsity, and compositionality on the analysis of microbiome data, PeerJ Prepr, 3, e1408 Love, 2014, Moderated estimation of fold change and dispersion for RNA-Seq data with DESeq2, Genome Biol, 15, 550, 10.1186/s13059-014-0550-8 Anders, 2010, Differential expression analysis for sequence count data, Genome Biol, 11, R106, 10.1186/gb-2010-11-10-r106 Kumar, 2014, Getting started with microbiome analysis: sample acquisition to bioinformatics, Curr Protoc Hum Genet, 82, 1 Salter, 2014, Reagent and laboratory contamination can critically impact sequence-based microbiome analyses, BMC Biol, 12, 87, 10.1186/s12915-014-0087-z Aitchison, 2003, A concise guide to compositional data analysis, CDA work, Girona, 24, 73 Jespers, 2012, Quantification of bacterial species of the vaginal microbiome in different groups of women, using nucleic acid amplification tests, BMC Microbiol, 12, 83, 10.1186/1471-2180-12-83 Gloor, 2010, Microbiome profiling by illumina sequencing of combinatorial sequence-tagged PCR products, PLoS One, 5, e15406, 10.1371/journal.pone.0015406 Poretsky, 2014, Strengths and limitations of 16S rRNA gene amplicon sequencing in revealing temporal microbial community dynamics, PLoS One, 9, e93827, 10.1371/journal.pone.0093827 Lucas, 2006, Sparse statistical modelling in gene expression genomics, 1 Little, 2002 Martín-Fernández, 2011, Dealing with Zeros van den Boogaart, 2013 Martín-Fernández, 2012, Model-based replacement of rounded zeros in compositional data: classical and robust approaches, Comput Stat Data Anal, 56, 2688, 10.1016/j.csda.2012.02.012 Filzmoser, 2012, Interpretation of multivariate outliers for compositional data, Comput Geosci, 39, 77, 10.1016/j.cageo.2011.06.014 Palarea-Albaladejo, 2015, zCompositions—R package for multivariate imputation of left-censored data under a compositional approach, Chemometer Intell Lab, 143, 85, 10.1016/j.chemolab.2015.02.019 Aitchison, 2003, Possible solutions of some essential zero problems in compositional data analysis, Compos Data Anal Work Girona, 2003, 6 Bacon-Shone, 2008, Discrete and continuous compositions Martín-Fernández, 2003, Dealing with zeros and missing values in compositional data sets using nonparametric imputation, Math Geol, 35, 253, 10.1023/A:1023866030544 Zuur, 2009 Paulson, 2013, Differential abundance analysis for microbial marker-gene surveys, Nat Methods, 10, 1200, 10.1038/nmeth.2658 Langille, 2013, Predictive functional profiling of microbial communities using 16S rRNA marker gene sequences, Nat Biotechnol, 31, 814, 10.1038/nbt.2676 Pärtel, 2014, Community ecology of absent species: hidden and dark diversity, J Veg Sci, 25, 1154 Pawlowsky-Glahn, 2015 Carr, 2013, Reconstructing the genomic content of microbiome taxa through shotgun metagenomic deconvolution, PLoS Comput Biol, 9, e1003292, 10.1371/journal.pcbi.1003292 Brown, 2015, Quorum-sensing dysbiotic shifts in the HIV-infected oral metabiome, PLoS One, 10, e0123880, 10.1371/journal.pone.0123880 Duran-Pinedo, 2011, Correlation network analysis applied to complex biofilm communities, PLoS One, 6, e28438, 10.1371/journal.pone.0028438 Fisher, 2014, Identifying keystone species in the human gut microbiome from metagenomic timeseries using sparse linear regression, PLoS One, 9, e102451, 10.1371/journal.pone.0102451 Friedman, 2012, Inferring correlation networks from genomic survey data, PLoS Comput Biol, 8, e1002687, 10.1371/journal.pcbi.1002687 Mcdonald, 2015, Context and the human microbiome, Microbiome, 3, 52, 10.1186/s40168-015-0117-2 Wagner, 2011, Application of two-part statistics for comparison of sequence variant counts, PLoS One, 6, e20296, 10.1371/journal.pone.0020296 Xiao, 2015, A catalog of the mouse gut metagenome, Nat Biotechnol, 33, 1103, 10.1038/nbt.3353 La Rosa, 2012, Hypothesis testing and power calculations for taxonomic-based human microbiome data, PLoS One, 7, e52078, 10.1371/journal.pone.0052078 Sinclair, 2015, Microbial community composition and diversity via 16S rRNA gene amplicons: evaluating the illumina platform, PLoS One, 10, e0116955, 10.1371/journal.pone.0116955