Simple means to improve the interpretability of regression coefficients

Methods in Ecology and Evolution - Volume 1, Issue 2 - Pages 103-113 - 2010
Holger Schielzeth1
1University of Uppsala, Sweden

Summary

1. Linear regression models are an important statistical tool in evolutionary and ecological studies. Unfortunately, these models often yield some uninterpretable estimates and hypothesis tests, especially when models contain interactions or polynomial terms. Furthermore, the standard errors for treatment groups, although often worth reporting in a publication, are not directly available in a standard linear model.

2. Centring and standardization of input variables are simple means to improve the interpretability of regression coefficients. Further, refitting the model with a slightly modified model structure allows extracting the appropriate standard errors for treatment groups directly from the model.

3. Centring will make main effects biologically interpretable even when involved in interactions and thus avoids the potential misinterpretation of main effects. This also applies to the estimation of linear effects in the presence of polynomials. Categorical input variables can also be centred and this sometimes assists interpretation.

4. Standardization (z‐transformation) of input variables results in the estimation of standardized slopes or standardized partial regression coefficients. Standardized slopes are comparable in magnitude within models as well as between studies. They have some advantages over partial correlation coefficients and are often the more interesting standardized effect size.

5. The thoughtful removal of intercepts or main effects allows extracting treatment means or treatment slopes and their appropriate standard errors directly from a linear model. This provides a simple alternative to the more complicated calculation of standard errors from contrasts and main effects.

6. The simple methods presented here put the focus on parameter estimation (point estimates as well as confidence intervals) rather than on significance thresholds. They allow fitting complex, but meaningful models that can be concisely presented and interpreted. The presented methods can also be applied to generalized linear models (GLMs) and linear mixed models.
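
The three techniques summarized in points 3 to 5 can be illustrated with a minimal sketch. It is not code from the paper: the data are hypothetical toy values, and `ols` is a small helper written for this illustration that solves the normal equations in plain Python. The sketch fits (a) an interaction model with raw and with centred inputs, (b) an additive model with raw and with z-transformed inputs, and (c) a cell-means model without an intercept, whose coefficients are the treatment means.

```python
from statistics import mean, stdev


def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    p = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in X) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(p)]
    for c in range(p):  # Gauss-Jordan elimination with partial pivoting
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]


# Hypothetical toy data (assumed for illustration, not from the paper)
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x2 = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]
y = [5.1, 7.9, 6.2, 12.5, 9.8, 16.0, 14.1, 20.3]
group = ["a", "a", "a", "b", "b", "b", "c", "c"]

# (a) Centring: y ~ x1 + x2 + x1:x2, columns are [1, x1, x2, x1*x2]
raw = ols([[1.0, a, b, a * b] for a, b in zip(x1, x2)], y)
c1 = [a - mean(x1) for a in x1]
c2 = [b - mean(x2) for b in x2]
cen = ols([[1.0, a, b, a * b] for a, b in zip(c1, c2)], y)
# raw[1] is the slope of x1 where x2 == 0;
# cen[1] is the slope of x1 at the mean of x2.

# (b) Standardization: z-transform the inputs of y ~ x1 + x2
z1 = [a / stdev(x1) for a in c1]
z2 = [b / stdev(x2) for b in c2]
unstd = ols([[1.0, a, b] for a, b in zip(x1, x2)], y)
std = ols([[1.0, a, b] for a, b in zip(z1, z2)], y)
# std[1] is the change in y per one standard deviation of x1.

# (c) No intercept: one indicator column per treatment level
levels = sorted(set(group))
D = [[1.0 if g == lev else 0.0 for lev in levels] for g in group]
means = ols(D, y)  # coefficients are the treatment means
fitted = [sum(d * m for d, m in zip(row, means)) for row in D]
rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
sigma2 = rss / (len(y) - len(levels))  # residual variance
# With indicator columns X'X is diagonal, so SE = sqrt(sigma2 / n_level)
se = [(sigma2 / group.count(lev)) ** 0.5 for lev in levels]

print("slope of x1 at x2 = 0:     ", raw[1])
print("slope of x1 at mean(x2):   ", cen[1])
print("standardized slope of x1:  ", std[1])
for lev, m, s in zip(levels, means, se):
    print(f"treatment {lev}: mean = {m:.3f}, SE = {s:.3f}")
```

In this sketch the centred and raw interaction fits are reparameterizations of the same model, so the interaction coefficient is unchanged and only the main effects shift; likewise the standardized slope equals the raw slope multiplied by the standard deviation of the input. In practice a statistics package would be used instead of the hand-written solver.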
