phyloseq: An R Package for Reproducible Interactive Analysis and Graphics of Microbiome Census Data
Abstract
Keywords
References
ML Metzker, 2010, Sequencing technologies - the next generation, Nature Reviews Genetics, 11, 31, 10.1038/nrg2626
M Hamady, 2008, Error-correcting barcoded primers for pyrosequencing hundreds of samples in multiplex, Nature Methods, 5, 235, 10.1038/nmeth.1184
NR Pace, 1997, A molecular view of microbial diversity and the biosphere, Science, 276, 734, 10.1126/science.276.5313.734
Z Liu, 2008, Accurate taxonomy assignments from 16S rRNA sequences produced by highly parallel pyrosequencers, Nucleic Acids Research, 36, e120, 10.1093/nar/gkn491
TZ DeSantis, 2006, NAST: a multiple sequence alignment server for comparative analysis of 16S rRNA genes, Nucleic Acids Research, 34, W394, 10.1093/nar/gkl244
TZ DeSantis, 2006, Greengenes, a chimera-checked 16S rRNA gene database and workbench compatible with ARB, Applied and Environmental Microbiology, 72, 5069, 10.1128/AEM.03006-05
JR Cole, 2009, The Ribosomal Database Project: improved alignments and new tools for rRNA analysis, Nucleic Acids Research, 37, D141, 10.1093/nar/gkn879
E Pruesse, 2007, SILVA: a comprehensive online resource for quality checked and aligned ribosomal RNA sequence data compatible with ARB, Nucleic Acids Research, 35, 7188, 10.1093/nar/gkm864
W Li, 2006, CD-HIT: a fast program for clustering and comparing large sets of protein or nucleotide sequences, Bioinformatics, 22, 1658, 10.1093/bioinformatics/btl158
Y Huang, 2010, CD-HIT Suite: a web server for clustering and comparing biological sequences, Bioinformatics, 26, 680, 10.1093/bioinformatics/btq003
J Caporaso, 2010, QIIME allows analysis of high-throughput community sequencing data, Nature Methods, 7, 335, 10.1038/nmeth.f.303
PD Schloss, 2009, Introducing mothur: Open-Source, Platform-Independent, Community-Supported Software for Describing and Comparing Microbial Communities, Applied and Environmental Microbiology, 75, 7537, 10.1128/AEM.01541-09
A Giongo, 2010, PANGEA: pipeline for analysis of next generation amplicons, The ISME Journal, 4, 852, 10.1038/ismej.2010.16
V Kunin, 2010, PyroTagger: A fast, accurate pipeline for analysis of rRNA amplicon pyrosequence data, The Open Journal
SV Angiuoli, 2011, CloVR: a virtual machine for automated and portable sequence analysis from the desktop using cloud computing, BMC Bioinformatics, 12, 356, 10.1186/1471-2105-12-356
2011, The Genboree Microbiome Toolset and the Analysis of 16S rRNA Microbial Sequences. biotconf.org
QIIME EC2 image documentation. Available: http://qiime.org/svn_documentation/tutorials/working_with_aws.html. Accessed 2013 March 22.
University of Colorado Boulder Knight Lab. n3phele bioinformatics in the cloud. Available: http://www.n3phele.com/. Accessed 2013 March 22.
F Meyer, 2008, The metagenomics RAST server - a public resource for the automatic phylogenetic and functional analysis of metagenomes, BMC Bioinformatics, 9, 386, 10.1186/1471-2105-9-386
JC Venter, 1998, Shotgun sequencing of the human genome, Science, 280, 1540, 10.1126/science.280.5369.1540
R Fleischmann, 1995, Whole-genome random sequencing and assembly of Haemophilus influenzae Rd, Science, 269, 496, 10.1126/science.7542800
JC Venter, 2004, Environmental genome shotgun sequencing of the sargasso sea, Science, 304, 66, 10.1126/science.1093857
TJ Sharpton, 2011, PhylOTU: a high-throughput procedure quantifies microbial community diversity and resolves novel taxa from metagenomic data, PLoS Computational Biology, 7, e1001061, 10.1371/journal.pcbi.1001061
R Development Core Team (2011) R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0.
Stroustrup B (2000) The C++ programming language. ISBN 0201700735. Addison-Wesley Professional, 3rd edition.
Simpson GL. CRAN Task View: Analysis of Ecological and Environmental Data. Available: http://cran.r-project.org/web/views/Environmetrics.html. Accessed 2013 March 22.
J Chakerian, 2010, distory: Distances between trees
KP Schliep, 2011, phangorn: phylogenetic analysis in R, Bioinformatics, 27, 592, 10.1093/bioinformatics/btq706
SW Kembel, 2010, Picante: R tools for integrating phylogenies and ecology, Bioinformatics, 26, 1463, 10.1093/bioinformatics/btq166
PJ McMurdie, 2012, phyloseq: A Bioconductor Package for Handling and Analysis of High-Throughput Phylogenetic Sequence Data, Pacific Symposium on Biocomputing, 17, 235
Hardle W, Ronz B, editors (2002) Sweave. Dynamic generation of statistical reports using literate data analysis. Compstat 2002, Proceedings in Computational Statistics.
Y Xie, 2012, knitr: A general-purpose package for dynamic report generation in R, R package version 0.8
RC Gentleman, 2004, Bioconductor: open software development for computational biology and bioinformatics, Genome Biology, 5, R80, 10.1186/gb-2004-5-10-r80
D Beck, 2011, OTUbase: an R infrastructure package for operational taxonomic unit data, Bioinformatics
OTUbase Bioconductor Release Page. (2012) Available: http://www.bioconductor.org/packages/release/bioc/html/OTUbase.html. Accessed 2013 March 22.
D McDonald, 2012, The Biological Observation Matrix (BIOM) format or: how I learned to stop worrying and love the ome-ome, GigaScience
McMurdie PJ, Holmes S. Package manual for phyloseq. Available: http://bioconductor.org/packages/devel/bioc/manuals/phyloseq/man/phyloseq.pdf. Accessed 2013 March 22.
The phyloseq Homepage. Available: joey711.github.com/phyloseq/. Accessed 2013 March 22.
2012, Writing R Extensions, Comprehensive R Archive Network
Wickham H, Danenberg P, Eugster M. roxygen2: In-source documentation for R. R package version 2.2.2. Available: http://cran.r-project.org/web/packages/roxygen2/index.html. Accessed 2013 March 22.
D Faith, 1987, Compositional dissimilarity as a robust measure of ecological distance, Vegetatio, 69, 57, 10.1007/BF00038687
MJ Anderson, 2006, Multivariate dispersion as a measure of beta diversity, Ecology Letters, 9, 683, 10.1111/j.1461-0248.2006.00926.x
M Hamady, 2009, Fast UniFrac: facilitating high-throughput phylogenetic analyses of microbial communities including analysis of pyrosequencing and PhyloChip data, The ISME Journal
CA Lozupone, 2007, Quantitative and qualitative beta diversity measures lead to different insights into factors that structure microbial communities, Applied and Environmental Microbiology, 73, 1576, 10.1128/AEM.01996-06
C Lozupone, 2005, UniFrac: a new phylogenetic method for comparing microbial communities, Applied and Environmental Microbiology, 71, 8228, 10.1128/AEM.71.12.8228-8235.2005
JG Caporaso, 2011, Global patterns of 16S rRNA diversity at a depth of millions of sequences per sample, Proceedings of the National Academy of Sciences, 108, 4516, 10.1073/pnas.1000080107
Greenacre MJ (1984) Theory and Applications of Correspondence Analysis. London: Academic Press.
CJF Ter Braak, 1986, Canonical Correspondence Analysis: A new eigenvector technique for multivariate direct gradient analysis, Ecology, 67, 1167, 10.2307/1938672
M Hill, 1980, Detrended Correspondence Analysis, an improved ordination technique, Vegetatio, 42, 47, 10.1007/BF00048870
AL Wollenberg, 1977, Redundancy analysis an alternative for canonical correlation analysis, Psychometrika, 42, 207, 10.1007/BF02294050
H Hotelling, 1933, Analysis of a complex of statistical variables into principal components, Journal of Educational Psychology, 24, 417, 10.1037/h0071325
S Pavoine, 2004, From dissimilarities among species to dissimilarities among communities: a double principal coordinate analysis, Journal of Theoretical Biology, 228, 523, 10.1016/j.jtbi.2004.02.014
JC Gower, 1966, Some distance properties of latent root and vector methods used in multivariate analysis, Biometrika, 53, 325, 10.1093/biomet/53.3-4.325
PR Minchin, 1987, An evaluation of the relative robustness of techniques for ecological ordination, Vegetatio, 69, 89, 10.1007/BF00038690
J Thioulouse, 2011, Simultaneous analysis of a sequence of paired ecological tables: A comparison of several methods, Annals of Applied Statistics, 5, 2300, 10.1214/10-AOAS372
Wilkinson L, Wills G (2005) The Grammar Of Graphics. Statistics and Computing. Springer, 2nd edition.
S Rajaram, 2010, NeatMap–non-clustering heat map alternatives in R, BMC Bioinformatics, 11, 45, 10.1186/1471-2105-11-45
G Csardi, 2006, The igraph software package for complex network research, InterJournal Complex Systems, 1695
Tufte ER (2001) The visual display of quantitative information, Graphics Press, Cheshire, Connecticut, chapter 9 Aesthetics and Technique in Data Graphical Design. 2nd edition, p. 178.
AJ Pinto, 2012, PCR Biases Distort Bacterial and Archaeal Community Structure in Pyrosequencing Datasets, PLoS ONE, 7, e43093, 10.1371/journal.pone.0043093
HL Sanders, 1968, Marine benthic diversity: A comparative study, The American Naturalist, 102, 243, 10.1086/282541
S Holmes, 2011, Visualization and statistical comparisons of microbial communities using R packages on phylochip data, Pacific Symposium on Biocomputing, 142
DB Allison, 2006, Microarray Data Analysis: from Disarray to Consolidation and Consensus, Nat Rev Genet, 7, 55, 10.1038/nrg1749
S Holmes, 2012, Statistical analysis challenges in the microbiome, To appear PNAS: The Social Biology of Microbial Communities forum on Microbial Threats
T Nelson, 2010, Shifts in luminal and mucosal microbial communities associated with an experimental model of irritable bowel syndrome, Gastroenterology
S Holmes, 2003, Bootstrapping phylogenetic trees: theory and methods, Statistical Science, 241, 10.1214/ss/1063994979
PH Westfall, 1993, Resampling-Based Multiple Testing. Examples and Methods for P-Value Adjustment, Wiley-Interscience
KS Pollard, 2010, multtest: Resampling-based multiple hypothesis testing, R package version 2.4.0
JPA Ioannidis, 2005, Why most published research findings are false, PLoS Medicine, 2, e124, 10.1371/journal.pmed.0020124
Z Merali, 2010, Computational science: Error, why scientific programming does not compute, Nature, 467, 775
RD Peng, 2011, Reproducible research in computational science, Science, 334, 1226, 10.1126/science.1213847
Carey VJ, Stodden V (2010) Reproducible Research Concepts and Tools for Cancer Bioinformatics. In: Ochs MF, Casagrande JT, Davuluri RV, editors, Biomedical Informatics for Cancer Research, Boston, MA: Springer US. pp. 149–175.
R Knight, 2012, Unlocking the potential of metagenomics through replicated experimental design, Nature Biotechnology, 30, 513, 10.1038/nbt.2235
2012, Structure, function and diversity of the healthy human microbiome, Nature, 486, 207, 10.1038/nature11234
DL Donoho, 2010, An invitation to reproducible computational research, Biostatistics (Oxford, England), 11, 385, 10.1093/biostatistics/kxq028
RD Peng, 2009, Reproducible research and Biostatistics, Biostatistics (Oxford, England), 10, 405, 10.1093/biostatistics/kxp014
R Gentleman, 2004, Statistical analyses and reproducible research, Bioconductor Project Working Papers, 2
F Pérez, 2007, IPython: a System for Interactive Scientific Computing, Comput Sci Eng, 9, 21, 10.1109/MCSE.2007.53
Allaire J, Horner J, Marti V, Porte N. The markdown package: Markdown rendering for R. R package version 0.5.4. Available: http://CRAN.R-project.org/package=markdown. Accessed 2013 March 22.
R Gentleman, 2005, Reproducible research: a bioinformatics case study, Statistical applications in genetics and molecular biology, 4, Article2, 10.2202/1544-6115.1034
The phyloseq Demo Repository. Available: https://github.com/joey711/phyloseq-demo. Accessed 2013 March 22.
WK Copeland, 2012, mcaGUI: microbial community analysis R-Graphical User Interface (GUI), Bioinformatics (Oxford, England), 28, 2198, 10.1093/bioinformatics/bts338
H Wickham, 2007, Reshaping data with the reshape package, Journal of Statistical Software, 21, 1, 10.18637/jss.v021.i12
H Wickham, 2011, The split-apply-combine strategy for data analysis, Journal of Statistical Software, 40, 1, 10.18637/jss.v040.i01
J Oksanen, 2011, vegan: Community Ecology Package, R package version 1.17-10