A guide to statistical analysis in microbial ecology: a community-focused, living review of multivariate data analyses

FEMS Microbiology Ecology - Volume 90, Issue 3 - Pages 543-550 - 2014
Pier Luigi Buttigieg (1, 2, 3), Alban Ramette (2, 4)
(1) Alfred Wegener Institute Helmholtz Centre for Polar and Marine Research, Bremerhaven, Germany
(2) HGF-MPG Group for Deep Sea Ecology and Technology, Bremerhaven, Germany
(3) MARUM Center for Marine Sciences, Bremen, Germany
(4) Max Planck Institute for Marine Microbiology, Bremen, Germany

Abstract

The application of multivariate statistical analyses has become a consistent feature in microbial ecology. However, many microbial ecologists are still in the process of developing a deep understanding of these methods and appreciating their limitations. As a consequence, staying abreast of progress and debate in this arena poses an additional challenge to many microbial ecologists. To address these issues, we present the GUide to STatistical Analysis in Microbial Ecology (GUSTA ME): a dynamic, web-based resource providing accessible descriptions of numerous multivariate techniques relevant to microbial ecologists. A combination of interactive elements allows users to discover and navigate between methods relevant to their needs and examine how they have been used by others in the field. We have designed GUSTA ME to become a community-led and -curated service, which we hope will provide a common reference and forum to discuss and disseminate analytical techniques relevant to the microbial ecology community.
