Consensus methods based on machine learning techniques for marine phytoplankton presence–absence prediction

Ecological Informatics - Tập 42 - Trang 46-54 - 2017
M. Bourel1,2, C. Crisci3, A. Martínez4
1Instituto de Matemática y Estadística Prof. Ing. Rafael Laguardia, Facultad de Ingeniería, Julio Herrera y Reissig 565, CP 11200 Montevideo, Uruguay
2Departamento Métodos Matemático Cuantitativos, Facultad de Ciencias Económicas y Administración, Universidad de la República, Av. Gonzalo Ramírez 1926, CP 11200 Montevideo, Uruguay
3Centro Universitario Regional del Este, Universidad de la República, Ruta Nacional n9 y Ruta n15, CP 27000 Rocha, Uruguay
4Dirección Nacional de Recursos Acuáticos, M.G.A.P., Puerto de La Paloma, CP 27001 La Paloma, Rocha, Uruguay

Tài liệu tham khảo

Alexandre, 2000, Combining independent and unbiased classifiers using weighted average, 2495 Andersen, 2003, Estimating Cell Numbers, 99 Anderson, 2010, Predicting potentially toxigenic pseudo-Nitzschia blooms in the chesapeake Bay, J. Mar. Syst., 83, 127, 10.1016/j.jmarsys.2010.04.003 Araújo, 2007, Ensemble forecasting of species distributions, Trends Ecol. Evol., 22, 42, 10.1016/j.tree.2006.09.010 Bergamino, 2016, Trophic niche shifts driven by phytoplankton in sandy beach ecosystems, Estuar. Coast. Shelf Sci., 180, 33, 10.1016/j.ecss.2016.06.023 Bourel, 2012, Model aggregation methods and applications, Mem. Trab. difusión Cient. Tec., 10, 19 Bourel, 2013, Apprentissage statistique par aggregation de modeles Breiman, 1996, Bagging predictors, Mach. Learn., 24, 123, 10.1007/BF00058655 Breiman, 1996, Stacked regression, Mach. Learn., 24, 49, 10.1007/BF00117832 Breiman, 1998, Arcing classifiers, Ann. Stat., 26, 801 Breiman, 2001, Random forests, Mach. Learn., 45, 5, 10.1023/A:1010933404324 Brotons, 2004, Presence–absence versus presence-only modelling methods for predicting bird habitat suitability, Ecography, 27, 437, 10.1111/j.0906-7590.2004.03764.x Brown, 2005, Diversity creation methods: a survey and categorisation, IEEE Circuits Syst. Mag., 6, 5 Campbell, 1996, The global distribution of surf diatom accumulations, Rev. Chil. Hist. Nat., 69, 495 Comte, 2013, Species distribution modelling and imperfect detection: comparing occupancy versus consensus methods, Divers. Distrib., 19, 996, 10.1111/ddi.12078 Crisci, 2012, A review of supervised machine learning algorithms and their applications to ecological data, Ecol. Model., 240, 113, 10.1016/j.ecolmodel.2012.03.001 Cutler, 2007, Random forests for classification in ecology, Ecology, 88, 2783, 10.1890/07-0539.1 De’ath, 2007, Boosted trees for ecological modeling and prediction, Ecology, 88, 243, 10.1890/0012-9658(2007)88[243:BTFEMA]2.0.CO;2 De’ath, 2000, Classification and regression trees: a powerful yet simple technique for ecological data analysis, Ecology, 81, 3178, 10.1890/0012-9658(2000)081[3178:CARTAP]2.0.CO;2 Devroye, 1997 Drake, 2006, Modelling ecological niches with support vector machines, J. Appl. Ecol., 43, 424, 10.1111/j.1365-2664.2006.01141.x Džeroski, 2004, Is combining classifiers with stacking better than selecting the best one?, Mach. Learn., 54, 255, 10.1023/B:MACH.0000015881.36452.6e Elith, 2006, Novel methods improve prediction of species distributions from occurrence data, Ecography, 29, 129, 10.1111/j.2006.0906-7590.04596.x Freund, 1997, A decision-theoretic generalization of on-line learning and an application to boosting, J. Comput. Syst. Sci., 55, 119, 10.1006/jcss.1997.1504 Fumera, 2005, A theoretical and experimental analysis of linear combiners for multiple classifier systems, IEEE Trans. Pattern Anal. Mach. Intell., 27, 942, 10.1109/TPAMI.2005.109 Guisan, 2002, Generalized linear and generalized additive models in studies of species distributions: setting the scene, Ecol. Model., 157, 89, 10.1016/S0304-3800(02)00204-1 Guisan, 2005, Predicting species distribution: offering more than simple habitat models, Ecol. Lett., 8, 993, 10.1111/j.1461-0248.2005.00792.x Guo, 2005, Support vector machines for predicting distribution of sudden oak death in California, Ecol. Model., 182, 75, 10.1016/j.ecolmodel.2004.07.012 Hastie, 2001, 10.1007/978-0-387-21606-5 James, 2014 Jeong, 1999, The ecological roles of heterotrophic dinoflagellates in marine planktonic community, J. Eukaryot. Microbiol., 46, 390, 10.1111/j.1550-7408.1999.tb04618.x Jeong, 2013, Red tides in Masan Bay, Korea in 2004-2005: i. Daily variations in the abundance of red-tide organisms and environmental factors, Harmful Algae, 30, S75, 10.1016/j.hal.2013.10.008 Jeong, 2015, A hierarchy of conceptual models of red-tide generation: nutrition, behavior, and biological interactions, Harmful Algae, 47, 97, 10.1016/j.hal.2015.06.004 Jeong, 2001, Prediction and elucidation of phytoplankton dynamics in the Nakdong River (Korea) by means of a recurrent artificial neural network, Ecol. Model., 146, 115, 10.1016/S0304-3800(01)00300-3 Kampichler, 2010, Classification in conservation biology: a comparison of five machine-learning methods, Eco. Inform., 5, 441, 10.1016/j.ecoinf.2010.06.003 Kim, 2016, Killing potential protist predators as a survival strategy of the newly described dinoflagellate Alexandrium pohangense, Harmful Algae, 55, 41, 10.1016/j.hal.2016.01.009 Kruk, 2012, The habitat template of phytoplankton morphology-based functional groups, Hydrobiologia, 698, 191, 10.1007/s10750-012-1072-6 Kuhn, 2013 Kuncheva, 2014 Lane, 2009, Development of a logistic regression model for the prediction of toxigenic pseudo-Nitzschia blooms in Monterey Bay, California, Mar. Ecol. Prog. Ser., 383, 37, 10.3354/meps07999 Lauzeral, 2015, The iterative ensemble modelling approach increases the accuracy of fish distribution models, Ecography, 38, 213, 10.1111/ecog.00554 Lee, 2003, Neural network modelling of coastal algal blooms, Ecol. Model., 159, 179, 10.1016/S0304-3800(02)00281-8 Lichman, 2013 Lobo, 2008, AUC: a misleading measure of the performance of predictive distribution models, Glob. Ecol. Biogeogr., 17, 145, 10.1111/j.1466-8238.2007.00358.x Manel, 2001, Evaluating presence–absence models in ecology: the need to account for prevalence, J. Appl. Ecol., 38, 921, 10.1046/j.1365-2664.2001.00647.x Marmion, 2009, Statistical consensus methods for improving predictive geomorphology maps, Comput. Geosci., 35, 615, 10.1016/j.cageo.2008.02.024 Marmion, 2009, Evaluation of consensus methods in predictive species distribution modelling, Divers. Distrib., 15, 59, 10.1111/j.1472-4642.2008.00491.x Masoudnia, 2014, Mixture of experts: a literature survey, Artif. Intell. Rev., 42, 275, 10.1007/s10462-012-9338-y McGillicuddy, 2010, Models of harmful algal blooms: conceptual, empirical, and numerical approaches, J. Mar. Syst., 83, 105, 10.1016/j.jmarsys.2010.06.008 Medvinsky, 2002, Spatiotemporal complexity of plankton and fish dynamics, SIAM Rev., 44, 311, 10.1137/S0036144502404442 Moisen, 2006, Predicting tree species presence and basal area in Utah: a comparison of stochastic gradient boosting, generalized additive models, and tree-based methods, Ecol. Model., 199, 176, 10.1016/j.ecolmodel.2006.05.021 Moore, 2008, Impacts of climate variability and future climate change on harmful algal blooms and human health, Environ. Health, 7, S4, 10.1186/1476-069X-7-S2-S4 Odebrecht, 2014, Surf zone diatoms: a review of the drivers, patterns and role in sandy beaches food chains, Estuar. Coast. Shelf Sci., 150, 24, 10.1016/j.ecss.2013.07.011 Olden, 2002, A comparison of statistical approaches for modelling fish species distributions, Freshw. Biol., 47, 1976, 10.1046/j.1365-2427.2002.00945.x Olden, 2008, Machine learning methods without tears: a primer for ecologists, Q. Rev. Biol., 83, 171, 10.1086/587826 Ortega, 2007, Multiannual and seasonal variability of water masses and fronts over the Uruguayan Shelf, J. Coast. Res., 23, 618, 10.2112/04-0221.1 Polikar, 2006, Ensemble based systems in decision making, IEEE Circuits Syst. Mag., 6, 21, 10.1109/MCAS.2006.1688199 Core Team, 2016, R: A Language and Environment for Statistical Computing Recknagel, 1998, Modelling and prediction of phyto-and zooplankton dynamics in Lake Kasumigaura by artificial neural networks, Lakes Reserv. Res. Manag., 3, 123, 10.1111/j.1440-1770.1998.tb00039.x Richardson, 2003, A dynamic quantitative approach for predicting the shape of phytoplankton profiles in the ocean, Prog. Oceanogr., 59, 301, 10.1016/j.pocean.2003.07.003 Scardi, 1999, Developing an empirical model of phytoplankton primary production: a neural network case study, Ecol. Model., 120, 213, 10.1016/S0304-3800(99)00103-9 Schapire, 1998, Boosting the margin: a new explanation for the effectiveness of voting methods, Ann. Stat., 26, 322 Smayda, 2002, Adaptive ecology, growth strategies and the global bloom expansion of dinoflagellates, J. Oceanogr., 58, 281, 10.1023/A:1015861725470 Talbot, 1990, A review of the ecology of surf-zone diatoms, with special reference to Anaulus australis, Oceanogr. Mar. Biol. Annu. Rev., 28, 155 Thuiller, 2004, Patterns and uncertainties of species' range shifts under climate change, Glob. Chang. Biol., 10, 2020, 10.1111/j.1365-2486.2004.00859.x Thuiller, 2009, BIOMOD — a platform for ensemble forecasting of species distributions, Ecography, 32, 369, 10.1111/j.1600-0587.2008.05742.x Thuiller, 2005, Climate change threats to plant diversity in Europe, Proc. Natl. Acad. Sci. U. S. A., 102, 8245, 10.1073/pnas.0409902102 Ting, 1999, Issues in stacked generalization, J. Artif. Intell. Res., 10, 271, 10.1613/jair.594 Todorovski, 2000, Combining Multiple Models with Meta Decision Trees Principles of Data Mining and Knowledge Discovery, 10.1007/3-540-45372-5_6 Utermöhl, 1958 Vapnik, 1995 Vilas, 2014, Support vector machine-based method for predicting pseudo-Nitzschia spp. blooms in coastal waters (Galician Rias, NW Spain), Prog. Oceanogr., 124, 66, 10.1016/j.pocean.2014.03.003 Wilson, 2001, Towards a generic artificial neural network model for dynamic predictions of algal abundance in freshwater lakes, Ecol. Model., 146, 69, 10.1016/S0304-3800(01)00297-6 Wolpert, 1992, Stacked generalization, Neural Netw., 5, 241, 10.1016/S0893-6080(05)80023-1