Accurate and Precise Prediction of Soil Properties from a Large Mid-Infrared Spectral Library

Soil Systems - Volume 3, Issue 1 - Page 11
Shree R. S. Dangal (1), Jonathan Sanderman (1), Skye Wills (2), Leonardo Ramírez-López (3)
(1) The Woods Hole Research Center, 149 Woods Hole Road, Falmouth, MA 02540, USA
(2) United States Department of Agriculture, Natural Resources Conservation Service (USDA-NRCS), 100 Centennial Mall North, Lincoln, NE 68508, USA
(3) NIR Data Analytics, BUCHI Labortechnik AG, Meierseggstrasse 40, 9230 Flawil, Switzerland

Abstract

Diffuse reflectance spectroscopy (DRS) is emerging as a rapid and cost-effective alternative to routine laboratory analysis for many soil properties. However, it has primarily been applied in project-specific contexts. Here, we assess DRS at the scale of the continental United States using the large (n > 50,000) USDA National Soil Survey Center mid-infrared spectral library and its associated soil characterization database. We tested and optimized several advanced statistical approaches for providing routine predictions of numerous soil properties relevant to studying carbon cycling. On independent validation sets, the machine learning algorithms Cubist and memory-based learner (MBL) both outperformed random forest (RF) and partial least squares regression (PLSR) and produced excellent overall models, with a mean R² of 0.92 (mean ratio of performance to deviation = 6.5) across all 10 soil properties. We found that root-mean-square error (RMSE) was misleading as a measure of the actual uncertainty of any particular prediction; therefore, we developed routines to assess prediction uncertainty for all models except Cubist. The MBL models produced much more precise predictions than global PLSR and RF. Finally, we present several techniques for flagging predictions of new samples that may be unreliable because their spectra fall outside the calibration set.
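
The validation statistics named in the abstract can be illustrated with a minimal sketch. It assumes the conventional definitions: RMSE as the root of the mean squared residual, R² as one minus the ratio of residual to total sum of squares, and the ratio of performance to deviation (RPD) as the sample standard deviation of the observed values divided by the RMSE. The abstract does not state which variants the authors used, and the function name and sample values below are hypothetical.

```python
import math
import statistics


def validation_metrics(observed, predicted):
    """Return RMSE, R^2 and RPD for paired observed/predicted values.

    Assumed (conventional) definitions, not confirmed by the abstract:
      RMSE = sqrt(mean((obs - pred)^2))
      R^2  = 1 - SS_res / SS_tot
      RPD  = sample SD of observed / RMSE
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")

    residuals = [o - p for o, p in zip(observed, predicted)]
    ss_res = sum(r * r for r in residuals)

    mean_obs = statistics.fmean(observed)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)

    rmse = math.sqrt(ss_res / len(observed))
    r2 = 1.0 - ss_res / ss_tot
    rpd = statistics.stdev(observed) / rmse
    return {"rmse": rmse, "r2": r2, "rpd": rpd}


# Hypothetical observed and predicted values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.8, 5.0]
metrics = validation_metrics(observed, predicted)
```

All three statistics summarize the whole validation set in a single number, which is consistent with the abstract's point that RMSE alone says little about the uncertainty of any one prediction.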
