Identification of Critical Flood Prone Areas in Data-Scarce and Ungauged Regions: A Comparison of Three Data Mining Models
Abstract
Floods are among the most devastating natural disasters, with socio-economic consequences. Preparing a flood-prone area (FPA) map is therefore essential for flood disaster management and for planning further development activities. The main goal of this study is to investigate new applications of the evidential belief function (EBF), random forest (RF), and boosted regression trees (BRT) models for identifying FPAs in the Galikesh region, Iran. The research was conducted in three main stages: data preparation; flood susceptibility mapping using the EBF, RF, and BRT models; and validation of the constructed models using the receiver operating characteristic (ROC) curve. First, a flood inventory map was prepared from documentary sources of the Iranian Water Resources Department (IWRD) and extensive field surveys. In total, 63 flood locations were identified in the study area. Of these, 47 (75%) were randomly selected for training (model building), and the remaining 16 (25%) were used for validation. The flood conditioning factors considered were altitude, slope aspect, slope angle, topographic wetness index, plan curvature, geology, land use, distance from rivers, drainage density, and soil texture. Subsequently, the FPA maps were prepared using the EBF, RF, and BRT models in a GIS environment. Finally, the results were validated using the ROC curve and area under the curve (AUC) analysis. The analysis showed that the EBF (AUC = 78.67%) and BRT (AUC = 78.22%) models performed better than the RF model (AUC = 73.33%). The resultant FPA maps can be useful to researchers and planners developing flood mitigation strategies.
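The random 47/16 split of the flood inventory and the AUC-based validation can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `split_inventory` and `roc_auc`, the seed, and the toy labels and scores are all hypothetical. It also assumes that non-flood points are available as negatives, which the abstract does not describe.

```python
import random


def split_inventory(n_locations=63, n_train=47, seed=0):
    """Randomly split flood-location indices into training and validation sets."""
    indices = list(range(n_locations))
    rng = random.Random(seed)  # the seed is an arbitrary assumption, for reproducibility
    rng.shuffle(indices)
    return sorted(indices[:n_train]), sorted(indices[n_train:])


def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity.

    labels: 1 for a flood point, 0 for a non-flood point.
    scores: predicted susceptibility for each point.
    A tie between a positive and a negative counts as half a correct ordering.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# 63 inventory locations -> 47 for model building, 16 for validation.
train_idx, valid_idx = split_inventory()

# Toy validation data, made up purely for illustration.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
```

The pairwise form is quadratic in the number of points, which is acceptable for an inventory of this size; a rank-based computation would be preferable for large rasters.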