Repeatability for Gaussian and non‐Gaussian data: a practical guide for biologists
Abstract
Repeatability (more precisely the common measure of repeatability, the intra‐class correlation coefficient, ICC) is an important index for quantifying the accuracy of measurements and the constancy of phenotypes. It is the proportion of phenotypic variation that can be attributed to between‐subject (or between‐group) variation. As a consequence, the non‐repeatable fraction of phenotypic variation is the sum of measurement error and phenotypic flexibility. There are several ways to estimate repeatability for Gaussian data, but there are no formal agreements on how repeatability should be calculated for non‐Gaussian data (e.g. binary, proportion and count data). In addition to point estimates, appropriate uncertainty estimates (standard errors and confidence intervals) and statistical significance for repeatability estimates are required regardless of the types of data. We review the methods for calculating repeatability and the associated statistics for Gaussian and non‐Gaussian data. For Gaussian data, we present three common approaches for estimating repeatability: correlation‐based, analysis of variance (ANOVA)‐based and linear mixed‐effects model (LMM)‐based methods, while for non‐Gaussian data, we focus on generalised linear mixed‐effects models (GLMM) that allow the estimation of repeatability on the original and on the underlying latent scale. We also address a number of methods for calculating standard errors, confidence intervals and statistical significance; the most accurate and recommended methods are parametric bootstrapping, randomisation tests and Bayesian approaches. We advocate the use of LMM‐ and GLMM‐based approaches mainly because of the ease with which confounding variables can be controlled for. Furthermore, we compare two types of repeatability (ordinary repeatability and extrapolated repeatability) in relation to narrow‐sense heritability. 
This review serves as a collection of guidelines and recommendations for biologists to calculate repeatability and heritability from both Gaussian and non‐Gaussian data.
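As a minimal illustration of the definition above (repeatability as the share of total variance that lies between subjects), the following sketch computes an ANOVA-based estimate for Gaussian data. It assumes a balanced design, with the same number of measurements for every subject, and uses one standard form of the ANOVA estimator, which is not necessarily the exact formulation used in the review. The function name `anova_repeatability` and the example data are hypothetical.

```python
from statistics import mean


def anova_repeatability(groups):
    """ANOVA-based repeatability (ICC) for a balanced one-way design.

    `groups` is a list of lists: one inner list of repeated
    measurements per subject. Assumes every subject has the same
    number of measurements (an assumption of this sketch).
    """
    k = len(groups)          # number of subjects (groups)
    n = len(groups[0])       # measurements per subject
    if k < 2 or n < 2 or any(len(g) != n for g in groups):
        raise ValueError("need >= 2 subjects with equal n >= 2 measurements")

    grand_mean = mean(x for g in groups for x in g)
    group_means = [mean(g) for g in groups]

    # Mean squares among subjects and within subjects
    ms_among = n * sum((m - grand_mean) ** 2 for m in group_means) / (k - 1)
    ms_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    ) / (k * (n - 1))

    # Between-subject variance component; the within-subject
    # variance component is ms_within itself
    var_among = (ms_among - ms_within) / n

    # Proportion of total variance attributable to between-subject variance
    return var_among / (var_among + ms_within)


# Hypothetical data: three subjects, two measurements each
r = anova_repeatability([[1, 2], [3, 4], [5, 6]])
```

A point estimate like this says nothing about uncertainty. The abstract recommends parametric bootstrapping, randomisation tests or Bayesian approaches for standard errors, confidence intervals and significance, and LMM- or GLMM-based estimation when confounding variables need to be controlled for.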