A practical guide to MaxEnt for modeling species' distributions: what it does, and why inputs and settings matter

Ecography - Volume 36, Issue 10 - Pages 1058-1069 - 2013
Cory Merow (1, 2), Matthew J. Smith (1), John A. Silander (2)
(1) Computational Ecology and Environmental Science Group, Computational Science Laboratory, Microsoft Research Ltd., 21 Station Road, Cambridge CB1 2FB, UK
(2) Univ. of Connecticut, Ecology and Evolutionary Biology, 75 North Eagleville Rd., Storrs, CT 06269, USA

Abstract

The MaxEnt software package is one of the most popular tools for species distribution and environmental niche modeling, with over 1000 published applications since 2006. Its popularity is likely for two reasons: 1) MaxEnt typically outperforms other methods based on predictive accuracy and 2) the software is particularly easy to use. MaxEnt users must make a number of decisions about how they should select their input data and choose from a wide variety of settings in the software package to build models from these data. The underlying basis for making these decisions is unclear in many studies, and default settings are apparently chosen, even though alternative settings are often more appropriate. In this paper, we provide a detailed explanation of how MaxEnt works and a prospectus on modeling options to enable users to make informed decisions when preparing data, choosing settings and interpreting output. We explain how the choice of background samples reflects prior assumptions, how nonlinear functions of environmental variables (features) are created and selected, how to account for environmentally biased sampling, the interpretation of the various types of model output and the challenges for model evaluation. We demonstrate MaxEnt's calculations using both simplified simulated data and occurrence data from South Africa on species of the flowering plant family Proteaceae. Throughout, we show how MaxEnt's outputs vary in response to different settings to highlight the need for making biologically motivated modeling decisions.
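As a rough illustration of the kind of calculation the abstract refers to, the following minimal sketch fits a MaxEnt-style model to simplified simulated data. It assumes a one-dimensional landscape of background cells, linear and quadratic features of a single environmental variable, and one constant L1 regularization weight. All names (`make_features`, `raw_output`, `fit_maxent`, `beta`) are hypothetical and chosen for illustration; this is not the MaxEnt software's interface, and the software's own feature scaling and per-feature regularization differ from this simplification.

```python
import numpy as np

def make_features(env):
    """Linear and quadratic features of one standardized environmental variable."""
    z = (env - env.mean()) / env.std()
    feats = np.column_stack([z, z ** 2])
    # Rescale each feature to zero mean and unit variance over the background.
    return (feats - feats.mean(axis=0)) / feats.std(axis=0)

def raw_output(feats, w):
    """Gibbs distribution over background cells: p(x) proportional to exp(w . f(x))."""
    score = feats @ w
    p = np.exp(score - score.max())  # subtract the max for numerical stability
    return p / p.sum()

def fit_maxent(feats, presence_idx, beta=0.01, lr=0.1, n_iter=10000):
    """Proximal gradient ascent on the L1-penalized presence log-likelihood.

    beta is a single, assumed regularization weight (a simplification).
    """
    w = np.zeros(feats.shape[1])
    # Empirical feature means at the presence locations (the constraints).
    target = feats[presence_idx].mean(axis=0)
    for _ in range(n_iter):
        p = raw_output(feats, w)
        # Gradient of the mean log-likelihood: observed minus expected feature means.
        grad = target - p @ feats
        w = w + lr * grad
        # Soft-thresholding step that applies the L1 penalty.
        w = np.sign(w) * np.maximum(np.abs(w) - lr * beta, 0.0)
    return w

# Simulated landscape: 200 background cells along one environmental gradient.
rng = np.random.default_rng(0)
env = np.linspace(0.0, 1.0, 200)

# Assumed "true" relative occurrence rate, peaked at env = 0.7.
truth = np.exp(-((env - 0.7) ** 2) / (2 * 0.1 ** 2))
truth /= truth.sum()

# Presence records are drawn from the true distribution (no sampling bias here).
presence_idx = rng.choice(env.size, size=200, p=truth)

feats = make_features(env)
w = fit_maxent(feats, presence_idx)
p = raw_output(feats, w)  # sums to 1 over the background cells

print("fitted weights:", w)
print("environment value at the fitted peak:", env[np.argmax(p)])
```

In this sketch the background cells define the set over which the fitted distribution is normalized, so changing which cells are included changes the fitted values. Raising `beta` shrinks the weights toward zero and pulls the prediction toward a uniform distribution over the background.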
