A hybrid machine learning and geostatistical method to improve the spatial prediction accuracy of potentially toxic elements in soil

Springer Science and Business Media LLC - Volume 37 - Pages 681-696 - 2022
Abiot Molla1,2,3, Weiwei Zhang4, Shudi Zuo1, Yin Ren1, Jigang Han4
1Key Laboratory of Urban Environment and Health, Fujian Key Laboratory of Watershed Ecology, Key Laboratory of Urban Metabolism of Xiamen, Institute of Urban Environment, Chinese Academy of Sciences, Xiamen, China
2University of Chinese Academy of Sciences, Beijing, China
3College of Agriculture and Natural Resources, Debre Markos University, Debre Markos, Ethiopia
4Key Laboratory of National Forestry and Grassland Administration on Ecological Landscaping of Challenging Urban Sites, Shanghai Academy of Landscape Architecture Science and Planning, Shanghai, China

Abstract

Effective environmental management and pollution remediation require accurate spatial distribution and prediction of potentially toxic elements (PTEs) in soil. However, no single method has been developed that predicts soil PTEs accurately. This study evaluated an advanced geostatistical method, empirical Bayesian kriging regression prediction (EBKRP), a machine learning algorithm, random forest (RF), and a hybrid model (RF-EBKRP) for predicting and mapping PTE contents in green space soils. As identified by RF, soil organic carbon, organic matter, total nitrogen, total phosphorus, and total potassium, together with topographic attributes and urban functional types, were used as important covariates to improve the prediction accuracy of soil PTEs. Model predictive performance was assessed using the root mean square error (RMSE), mean absolute percentage error (MAPE), and coefficient of determination (R2). The results showed that RF performed considerably better than EBKRP in predicting soil PTEs, with lower prediction errors and higher R2. The RMSE, MAPE, and R2 values for the RF model were 0.25–85.32 mg/kg, 3.86–25.40%, and 0.77–0.90, respectively, whereas those for EBKRP were 0.51–99.03 mg/kg, 5.42–32.13%, and 0.40–0.66. Moreover, the RF-EBKRP method produced more accurate spatial predictions and distributions of PTEs than the individual models, improving R2 by 122.5% relative to EBKRP and by 15.58% relative to RF. The better performance of RF-EBKRP is attributed to its incorporation of multiple covariates and its ability to handle complex nonlinear relationships between soil PTEs and the covariates. In conclusion, the hybrid RF-EBKRP method is a promising approach for improving spatial distribution maps of soil PTEs.
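The three evaluation metrics named in the abstract can be illustrated with a minimal sketch. The definitions below are the standard textbook forms; the abstract does not state the exact formulas the authors used (R2, for instance, is sometimes computed as a squared correlation instead), so treat them as assumptions. The function names, the `relative_improvement` helper, and all sample values are hypothetical and are not taken from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean square error, in the units of the observations (e.g. mg/kg)."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mape(observed, predicted):
    """Mean absolute percentage error, in percent. Observations must be non-zero."""
    n = len(observed)
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(observed, predicted)) / n


def r_squared(observed, predicted):
    """Coefficient of determination as 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def relative_improvement(baseline, improved):
    """Percentage change of a score relative to a baseline score (assumed formula)."""
    return 100.0 * (improved - baseline) / baseline


# Hypothetical observed and predicted concentrations, for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]

print("RMSE:", rmse(observed, predicted))
print("MAPE (%):", mape(observed, predicted))
print("R2:", r_squared(observed, predicted))
# Hypothetical R2 scores for a baseline model and an improved model.
print("R2 improvement (%):", relative_improvement(0.50, 0.75))
```

Lower RMSE and MAPE and higher R2 indicate better predictions, which is the sense in which the abstract compares the models. RMSE carries the units of the measured variable, so its range depends on the concentration scale of each element, while MAPE and R2 are scale-free.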

Keywords

#PTE #spatial #empirical Bayesian kriging regression prediction #random forest #hybrid model #spatial prediction.
