Microbiome Datasets Are Compositional: And This Is Not Optional

Gregory B. Gloor1, Jean M. Macklaim1, Vera Pawlowsky‐Glahn2, Juan José Egozcue3
1Department of Biochemistry, University of Western Ontario, Canada
2Departments of Computer Science, Applied Mathematics, and Statistics, Universitat de Girona, Spain
3Department of Applied Mathematics, Universitat Politècnica de Catalunya, Spain

Abstract

Keywords


References

Aitchison, 1983, Principal component analysis of compositional data, Biometrika, 70, 57, 10.1093/biomet/70.1.57

Aitchison, 1986, The Statistical Analysis of Compositional Data, 10.1007/978-94-009-4109-0

Aitchison, 2000, Logratio analysis and compositional distance, Math. Geol., 32, 271, 10.1023/A:1007529726302

Aitchison, 2002, Biplots of compositional data, J. Roy. Stat. Soc. Ser. C, 51, 375, 10.1111/1467-9876.00275

Anders, 2010, Differential expression analysis for sequence count data, Genome Biol., 11, R106, 10.1186/gb-2010-11-10-r106

Bian, 2017, The gut microbiota of healthy aged Chinese is similar to that of the healthy young, mSphere, 2, e00327, 10.1128/mSphere.00327-17

Erb, 2016, How should we measure proportionality on relative gene expression data?, Theory Biosci., 135, 21, 10.1007/s12064-015-0220-8

Fernandes, 2013, ANOVA-like differential expression (ALDEx) analysis for mixed population RNA-seq, PLoS ONE, 8, e67019, 10.1371/journal.pone.0067019

Fernandes, 2014, Unifying the analysis of high-throughput sequencing datasets: characterizing RNA-seq, 16S rRNA gene sequencing and selective growth experiments by compositional data analysis, Microbiome, 2, 15.1, 10.1186/2049-2618-2-15

Friedman, 2012, Inferring correlation networks from genomic survey data, PLoS Comput. Biol., 8, e1002687, 10.1371/journal.pcbi.1002687

Gloor, 2016, Compositional uncertainty should not be ignored in high-throughput sequencing data analysis, Aust. J. Stat., 45, 73, 10.17713/ajs.v45i4.122

Gloor, 2016, Compositional analysis: a valid approach to analyze microbiome high-throughput sequencing data, Can. J. Microbiol., 62, 692, 10.1139/cjm-2015-0821

Gloor, 2016, It's all relative: analyzing microbiome data as compositions, Ann. Epidemiol., 26, 322, 10.1016/j.annepidem.2016.03.003

Gorvitovskaia, 2016, Interpreting Prevotella and Bacteroides as biomarkers of diet and lifestyle, Microbiome, 4, 15, 10.1186/s40168-016-0160-7

Hawinkel, 2017, A broken promise: microbiome differential abundance methods do not control the false discovery rate, Brief. Bioinf., bbx104, 10.1093/bib/bbx104

Jaynes, 2003, Probability Theory: The Logic of Science, 10.1017/CBO9780511790423

Kurtz, 2015, Sparse and compositionally robust inference of microbial ecological networks, PLoS Comput. Biol., 11, e1004226, 10.1371/journal.pcbi.1004226

Lovell, 2011, Proportions, percentages, ppm: do the molecular biosciences treat compositional data right?, Compositional Data Analysis: Theory and Applications, 193, 10.1002/9781119976462.ch14

Lovell, 2015, Proportionality: a valid alternative to correlation for relative data, PLoS Comput. Biol., 11, e1004075, 10.1371/journal.pcbi.1004075

Lozupone, 2011, UniFrac: an effective distance metric for microbial community comparison, ISME J., 5, 169, 10.1038/ismej.2010.133

Macklaim, 2013, Comparative meta-RNA-seq of the vaginal microbiota and differential expression by Lactobacillus iners in health and dysbiosis, Microbiome, 1, 12, 10.1186/2049-2618-1-12

Mandal, 2015, Analysis of composition of microbiomes: a novel method for studying microbial composition, Microb. Ecol. Health Dis., 26, 27663, 10.3402/mehd.v26.27663

Martín-Fernández, 1998, Measures of difference for compositional data and hierarchical clustering methods, Proc. IAMG, 98, 526

McMillan, 2015, A multi-platform metabolomics approach identifies highly specific biomarkers of bacterial diversity in the vagina of pregnant and non-pregnant women, Sci. Rep., 5, 14174, 10.1038/srep14174

McMurdie, 2013, phyloseq: an R package for reproducible interactive analysis and graphics of microbiome census data, PLoS ONE, 8, e61217, 10.1371/journal.pone.0061217

McMurdie, 2014, Waste not, want not: why rarefying microbiome data is inadmissible, PLoS Comput. Biol., 10, e1003531, 10.1371/journal.pcbi.1003531

McMurrough, 2014, Control of catalytic efficiency by a coevolving network of catalytic and noncatalytic residues, Proc. Natl. Acad. Sci. U.S.A., 111, E2376, 10.1073/pnas.1322352111

Morton, 2017, Uncovering the horseshoe effect in microbial analyses, mSystems, 2, e00166, 10.1128/mSystems.00166-16

Ortego, 2013, Spurious copulas, Proceedings of the 5th Workshop on Compositional Data Analysis, CoDaWork 2013

Palarea-Albaladejo, 2015, zCompositions — R package for multivariate imputation of left-censored data under a compositional approach, Chemometr. Intel. Lab. Syst., 143, 85, 10.1016/j.chemolab.2015.02.019

Pawlowsky-Glahn, 2015, Modeling and Analysis of Compositional Data, 10.1002/9781119003144

Pearson, 1897, Mathematical contributions to the theory of evolution. – on a form of spurious correlation which may arise when indices are used in the measurement of organs, Proc. Roy. Soc. Lond., 60, 489, 10.1098/rspl.1896.0076

Quinn, 2017, propr: An R-package for identifying proportionally abundant features using compositional data analysis, bioRxiv, 10.1101/104935

Robinson, 2016, Intricacies of assessing the human microbiome in epidemiologic studies, Ann. Epidemiol., 26, 311, 10.1016/j.annepidem.2016.04.005

Robinson, 2010, A scaling normalization method for differential expression analysis of RNA-seq data, Genome Biol., 11, R25.1, 10.1186/gb-2010-11-3-r25

Shaffer, 1981, Minimum population sizes for species conservation, BioScience, 31, 131, 10.2307/1308256

Silverman, 2017, A phylogenetic transform enhances analysis of compositional microbiota data, eLife, 6, 21887, 10.7554/eLife.21887

Thorsen, 2016, Large-scale benchmarking reveals false discoveries and count transformation sensitivity in 16S rRNA gene amplicon data analysis methods used in microbiome studies, Microbiome, 4, 62, 10.1186/s40168-016-0208-8

Tsilimigras, 2016, Compositional data analysis of the microbiome: fundamentals, tools, and challenges, Ann. Epidemiol., 26, 330, 10.1016/j.annepidem.2016.03.002

Van den Boogaart, 2013, Analyzing Compositional Data with R, 10.1007/978-3-642-36809-7

Weiss, 2016, Correlation detection strategies in microbial data sets vary widely in sensitivity and precision, ISME J., 10, 1669, 10.1038/ismej.2015.235

Weiss, 2017, Normalization and microbial differential abundance strategies depend upon data characteristics, Microbiome, 5, 27, 10.1186/s40168-017-0237-y

Wong, 2016, Expanding the UniFrac toolbox, PLoS ONE, 11, e0161196, 10.1371/journal.pone.0161196