Rhea: a transparent and modular R pipeline for microbial profiling based on 16S rRNA gene amplicons
Abstract
The importance of 16S rRNA gene amplicon profiles for understanding the influence of microbes in a variety of environments, coupled with the steep reduction in sequencing costs, has led to a surge of microbial sequencing projects. The growing number of scientists and clinicians who want to use sequencing datasets can choose among a range of multipurpose software platforms, which can be intimidating for non-expert users. Among the available options for high-throughput 16S rRNA gene analysis, the R programming language and software environment for statistical computing stands out for its power and flexibility, and because it allows users to follow the most recent best practices and to adapt analyses to individual project needs. Here we present the Rhea pipeline, a set of R scripts that encode a series of well-documented choices for the downstream analysis of Operational Taxonomic Unit (OTU) tables, including normalization steps,