A review of scientific advancements in datasets derived from big data for monitoring the Sustainable Development Goals

Sustainability Science - Volume 16 - Pages 1701-1716 - 2021
Cameron Allen1,2,3, Maggie Smith1, Maryam Rabiee1, Hayden Dahmm1
1Sustainable Development Solutions Network (SDSN) Thematic Research Network On Data and Statistics (TReNDS), New York, USA
2Monash Sustainable Development Institute (MSDI), Monash University, Melbourne, Australia
3School of Biological, Earth and Environmental Sciences and School of Environmental and Civil Engineering, UNSW, Sydney, Australia

Abstract

The Sustainable Development Goals (SDGs) suffer from a lack of national data needed for effective monitoring and implementation. Almost half of the SDG indicators are not regularly produced, and available datasets are often out-of-date. New monitoring approaches using big data are advancing rapidly and can complement official statistics to help fill critical data gaps. However, there is poor information-sharing on the latest innovations and research collaborations across different thematic areas, and limited evaluation of strengths and weaknesses for supporting national monitoring. This paper provides a systematic review of the academic literature over the past 5 years relating to the use of big data to support monitoring of the SDGs. It reviews the state-of-the-art research using big data and advanced analytics to produce new datasets, the alignment of these datasets with the official SDG indicators, the main types and sources of big data used, and the analytical methods applied. We developed a set of evaluation criteria and applied it to highlight some of the strengths and limitations of these datasets derived from big data. We find that recent research has developed a considerable range of new datasets that could contribute to monitoring 15 goals, 51 targets, and 69 indicators. Dominant focal areas of research include land and biodiversity, health, water, cities and settlements, and poverty. Satellite and Earth Observation data were the primary sources used, most commonly applied with machine learning methods and cloud computing. However, several challenges remain, including ensuring the relevance of new datasets for monitoring SDG indicators, cost and accessibility considerations, sustainability aspects, and linking global datasets to nationally owned monitoring processes.
