Random Forest and Logistic Regression algorithms for prediction of groundwater contamination using ammonia concentration
Abstract
This study develops a predictive model for groundwater contamination using Multivariate Logistic Regression (MLR) and Random Forest (RF) algorithms. Many authors have reported ammonia contamination in Sohag Governorate, Egypt, and attributed it to urban growth and to agricultural and industrial activities. Thirty-two groundwater samples representing the Quaternary aquifer were collected and analyzed for major cations (Ca, Mg, and Na), ammonia, nitrate, phosphate, and heavy metals. Lead, magnesium, iron, and zinc were used as input variables to test the model, with ammonia serving as the index of groundwater contamination. Spatial distribution maps and statistical analyses show a strong correlation of ammonia with lead and magnesium, whereas iron and zinc correlate less strongly. For the RF model, the data were divided into 70% training and 30% testing subsets. Model performance was evaluated using classification reports and the confusion matrix. The results show that (1) the RF model predicts groundwater contamination with an accuracy of 93% and (2) the MLR accuracy increased from 70% to 83% when the “SOLVER” and “C” parameters were modified. The study helps identify the contaminated zones in the study area and demonstrates the usefulness of machine learning models for predicting groundwater contamination from ammonia concentration.
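The 70/30 split and the confusion-matrix evaluation mentioned in the abstract can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the random seed, and the two label vectors are hypothetical, and the labels are invented only to exercise the arithmetic. They are not the study's data or results.

```python
import random


def split_70_30(samples, seed=0):
    """Shuffle the samples and split them into 70% training / 30% testing."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seed is an arbitrary assumption
    n_train = int(len(shuffled) * 0.7)
    return shuffled[:n_train], shuffled[n_train:]


def confusion_matrix(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels, where 1 = contaminated."""
    tn = fp = fn = tp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 1:
            fn += 1
        elif pred == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


def report(y_true, y_pred):
    """Compute the headline metrics of a binary classification report."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred)
    total = tn + fp + fn + tp
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Indices standing in for the 32 groundwater samples.
train, test = split_70_30(range(32))

# Hypothetical true and predicted labels for the testing subset.
y_true = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 1, 1, 0]
metrics = report(y_true, y_pred)
```

With 32 samples, this split rule gives 22 training and 10 testing samples. A library splitter may round differently, so the exact subset sizes used in the study are not implied here.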