Hyperspectral Data and Machine Learning for Estimating CDOM, Chlorophyll a, Diatoms, Green Algae and Turbidity

Sina Keller¹, Philipp M. Maier¹, Felix M. Riese¹, Stefan Norra², Andreas Holbach³, Nicolas Börsig², Andre Wilhelms², Christian Moldaenke⁴, André Zaake⁴, Stefan Hinz¹
¹ Institute of Photogrammetry and Remote Sensing, Karlsruhe Institute of Technology, Kaiserstr. 12, 76131 Karlsruhe, Germany
² Institute of Applied Geoscience, Karlsruhe Institute of Technology, Kaiserstr. 12, 76131 Karlsruhe, Germany
³ Department of Bioscience, Aarhus University, Frederiksborgvej 399, 4000 Roskilde, Denmark
⁴ bbe Moldaenke GmbH, Preetzer Chaussee 177, 24222 Schwentinental, Germany

Abstract

Inland waters are of great importance to scientists and authorities alike, since they are essential ecosystems and well known for their biodiversity. In situ measurements of water quality parameters, however, are spatially limited, costly and time-consuming. In this paper, we propose combining hyperspectral data with machine learning methods to estimate, and thereby monitor, several water quality parameters. In contrast to commonly applied techniques such as band ratios, this approach is data-driven and does not rely on domain knowledge. We focus on CDOM, chlorophyll a and turbidity as well as the concentrations of two algae types, diatoms and green algae. To investigate the potential of this approach, we use data measured with three different sensors on the River Elbe in Germany from 24 June to 12 July 2017. The measurement setup, consisting of two probe sensors and a hyperspectral sensor, is described in detail. To estimate the five variables, we present a regression framework involving ten machine learning models and two preprocessing methods, which allows the regression performance to be evaluated for each model and variable. The best-performing model for each variable achieves a coefficient of determination R² in the range of 89.9% to 94.6%, which demonstrates the potential of machine learning approaches applied to hyperspectral data. In further investigations, we will focus on generalizing the regression framework to prepare its application to different types of inland waters.
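
The reported R² values can be read against the usual definition of the coefficient of determination, R² = 1 - SS_res / SS_tot. The abstract does not state which exact formula was used, so the following is only a minimal sketch under that assumption. The function name `r_squared` and the sample values are hypothetical and are not taken from the study.

```python
from statistics import fmean


def r_squared(y_true, y_pred):
    """Coefficient of determination, assuming the common form 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_true = fmean(y_true)
    # Residual sum of squares: deviation of estimates from reference values.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: spread of reference values around their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all reference values are identical")
    return 1.0 - ss_res / ss_tot


# Hypothetical probe measurements and model estimates (made-up numbers).
measured = [10.0, 12.0, 14.0, 16.0, 18.0]
estimated = [11.0, 12.0, 13.0, 16.0, 18.0]
score = r_squared(measured, estimated)
print(f"R^2 = {score:.1%}")  # prints "R^2 = 95.0%"
```

Under this definition, a model that always predicts the mean of the reference values scores 0, and a perfect model scores 1, so values around 90% to 95% indicate that most of the variance in the probe measurements is explained by the estimates.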
