Methods to account for spatial autocorrelation in the analysis of species distributional data: a review
Abstract
Species distributional or trait data based on range map (extent-of-occurrence) or atlas survey data often display spatial autocorrelation, i.e. locations close to each other exhibit more similar values than those further apart. If this pattern remains present in the residuals of a statistical model based on such data, one of the key assumptions of standard statistical analyses, that residuals are independent and identically distributed (i.i.d.), is violated. Violating this assumption may bias parameter estimates and can increase type I error rates (falsely rejecting the null hypothesis of no effect). While this is increasingly recognised by researchers analysing species distribution data, there is, to our knowledge, no comprehensive overview of the many available spatial statistical methods to take spatial autocorrelation into account in tests of statistical significance. Here, we describe six different statistical approaches to infer correlates of species' distributions, for both presence/absence (binary response) and species abundance data (Poisson or normally distributed response), while accounting for spatial autocorrelation in model residuals: autocovariate regression; spatial eigenvector mapping; generalised least squares; (conditional and simultaneous) autoregressive models; and generalised estimating equations. A comprehensive comparison of the relative merits of these methods is beyond the scope of this paper. To demonstrate each method's implementation, however, we undertook preliminary tests based on simulated data. These preliminary tests verified that most of the spatial modelling techniques we examined showed good type I error control and precise parameter estimates, at least when confronted with simplistic simulated data containing spatial autocorrelation in the errors. However, we found that for presence/absence data the results and conclusions varied considerably among the different methods. This is likely due to the low information content of binary maps. Also, in contrast with previous studies, we found that autocovariate methods consistently underestimated the effects of environmental controls of species distributions. Given their widespread use, in particular for the modelling of species presence/absence data (e.g. climate envelope models), we argue that this warrants further study and caution in their use. To aid other ecologists in making use of the methods described, code to implement them in freely available software is provided in an electronic appendix.
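The notion of spatial autocorrelation used above (nearby locations being more similar than distant ones) can be illustrated with Moran's I, a common summary statistic for it. The following is a minimal Python sketch and is not the code from the electronic appendix. The 10 x 10 grid, the binary distance-band weights, the cutoff of 1.5 grid units and the function name `morans_i` are all assumptions made for illustration only.

```python
import numpy as np


def morans_i(values, coords, max_dist=1.5):
    """Moran's I using binary distance-band weights (an illustrative choice)."""
    values = np.asarray(values, dtype=float)
    coords = np.asarray(coords, dtype=float)
    # Pairwise Euclidean distances between all locations.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Weight 1 for distinct locations no further apart than max_dist, else 0.
    w = ((dist > 0) & (dist <= max_dist)).astype(float)
    z = values - values.mean()
    n = len(values)
    return (n / w.sum()) * (z @ w @ z) / (z @ z)


# Hypothetical 10 x 10 lattice of sampling locations (not data from the paper).
side = 10
xx, yy = np.meshgrid(np.arange(side), np.arange(side))
coords = np.column_stack([xx.ravel(), yy.ravel()])

rng = np.random.default_rng(0)
# A smooth spatial gradient plus a little noise: strongly autocorrelated.
gradient = coords[:, 0] + coords[:, 1] + rng.normal(scale=0.5, size=side * side)
# Independent noise: no spatial structure expected.
noise = rng.normal(size=side * side)

i_gradient = morans_i(gradient, coords)
i_noise = morans_i(noise, coords)
print(f"Moran's I, gradient: {i_gradient:.2f}")
print(f"Moran's I, noise:    {i_noise:.2f}")
```

In practice the same statistic would be applied to the residuals of a fitted model rather than to raw values. A clearly positive value in the residuals suggests that the i.i.d. assumption is violated, while a value near zero is consistent with independence.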