Effects of sample size on the performance of species distribution models

Diversity and Distributions - Volume 14, Issue 5 - Pages 763-773 - 2008
Mary S. Wisz (1), Robert J. Hijmans (2), Jin Li (3), A. Townsend Peterson (4), Catherine H. Graham (5), Antoine Guisan (6)
(1) Department of Arctic Environment, National Environmental Research Institute, University of Aarhus, Frederiksborgvej 399, Roskilde, Denmark
(2) International Rice Research Institute, Los Banos, Laguna, Philippines
(3) Department of Marine and Coastal Environment, Geoscience, Canberra, ACT, Australia
(4) University of Kansas Natural History Museum and Biodiversity Research Center, Lawrence, KS, USA
(5) Department of Ecology and Evolution, Stony Brook University, NY 11794, USA
(6) Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland

Abstract

A wide range of modelling algorithms is used by ecologists, conservation practitioners, and others to predict species ranges from point locality data. Unfortunately, the amount of data available is limited for many taxa and regions, making it essential to quantify the sensitivity of these algorithms to sample size. This is the first study to address this need by rigorously evaluating a broad suite of algorithms with independent presence–absence data from multiple species and regions. We evaluated predictions from 12 algorithms for 46 species (from six different regions of the world) at three sample sizes (100, 30, and 10 records). We used data from natural history collections to run the models, and evaluated the quality of model predictions with area under the receiver operating characteristic curve (AUC). With decreasing sample size, model accuracy decreased and variability increased across species and between models. Novel modelling methods that incorporate both interactions between predictor variables and complex response shapes (i.e. GBM, MARS‐INT, BRUTO) performed better than most methods at large sample sizes but not at the smallest sample sizes. Other algorithms were much less sensitive to sample size, including an algorithm based on maximum entropy (MAXENT) that had among the best predictive power across all sample sizes. Relative to other algorithms, a distance metric algorithm (DOMAIN) and a genetic algorithm (OM‐GARP) had intermediate performance at the largest sample size and among the best performance at the lowest sample size. No algorithm predicted consistently well with small sample size (n < 30) and this should encourage highly conservative use of predictions based on small sample size and restrict their use to exploratory modelling.
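
As a minimal illustration of the evaluation metric named in the abstract, the sketch below computes AUC from predicted suitability scores at independent presence and absence sites. It uses the rank-based reading of AUC: the probability that a randomly chosen presence site receives a higher score than a randomly chosen absence site, with ties counted as half. The function name `auc` and the score values are hypothetical and chosen only for illustration; they are not taken from the study, and this is not the authors' evaluation code.

```python
from typing import Sequence


def auc(scores_presence: Sequence[float], scores_absence: Sequence[float]) -> float:
    """Return AUC as the fraction of presence/absence pairs ranked correctly.

    Each pair in which the presence site outscores the absence site counts
    as 1, and each tied pair counts as 0.5 (Mann-Whitney formulation).
    """
    if not scores_presence or not scores_absence:
        raise ValueError("need at least one presence and one absence score")
    wins = 0.0
    for p in scores_presence:
        for a in scores_absence:
            if p > a:
                wins += 1.0
            elif p == a:
                wins += 0.5
    return wins / (len(scores_presence) * len(scores_absence))


# Hypothetical model predictions at evaluation sites (illustrative values only).
presence_scores = [0.9, 0.8, 0.6, 0.4]
absence_scores = [0.7, 0.3, 0.2, 0.1]

example_auc = auc(presence_scores, absence_scores)
print(example_auc)  # 14 of 16 pairs are ranked correctly -> 0.875
```

An AUC of 0.5 corresponds to predictions no better than random ranking, and 1.0 to perfect separation of presence and absence sites. Comparing sample sizes in the way the abstract describes would amount to refitting a model on subsets of 100, 30, and 10 records and scoring each fit against the same independent evaluation sites.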
