Accounting for multiple testing in the analysis of spatio-temporal environmental data
Abstract
The statistical analysis of environmental data from remote sensing and Earth system simulations often involves gridded spatio-temporal data, with a hypothesis test performed for each grid cell. When the whole image or a set of grid cells is analyzed for a global effect, the problem of multiple testing arises. Even when no global effect is present, we expect $$ \alpha $$% of all grid cells to be false positives, where $$ \alpha $$% is the significance level of each individual test, and spatially autocorrelated data can produce clustered spurious rejections that are misleading in an analysis of spatial patterns. In this work, we review standard solutions for the multiple testing problem and apply them to spatio-temporal environmental data. These solutions do not depend on the choice of test statistic, so any test statistic can be used (e.g., tests for trends or change points in time series). Additionally, we introduce permutation methods and show that they have higher statistical power. Real-world data are used to provide examples of the analysis, and the performance of each method is assessed in a comprehensive simulation study which, unlike other simulation studies, compares the statistical power of the presented methods. In conclusion, we present several statistically rigorous methods for analyzing spatio-temporal environmental data while controlling false positives. These methods allow the use of any test statistic in a wide range of applications in environmental sciences and remote sensing.
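To make the multiple testing problem concrete, the following minimal sketch simulates a grid with no effect anywhere and counts the rejections with and without correction. It is an illustration under stated assumptions, not the analysis of the paper: the abstract does not name the corrections it reviews, so the Bonferroni and Benjamini-Hochberg procedures are used here as two commonly applied examples; grid cells are treated as independent (spatial autocorrelation is not modeled); and the per-cell p-values are drawn directly from a uniform distribution, which is what a valid test yields under the null hypothesis. The function names and the grid size are hypothetical.

```python
import random


def bonferroni(p_values, alpha=0.05):
    """Reject cell i if p_i <= alpha / m (controls the familywise error rate)."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure (controls the false discovery rate)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest 1-based rank k whose sorted p-value satisfies p_(k) <= k * alpha / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            k_max = rank
    reject = [False] * m
    for idx in order[:k_max]:
        reject[idx] = True
    return reject


# Assumption: 10,000 independent grid cells, no true effect anywhere,
# so every p-value is uniform on [0, 1].
rng = random.Random(0)
n_cells = 10_000
alpha = 0.05
null_p = [rng.random() for _ in range(n_cells)]

n_uncorrected = sum(p <= alpha for p in null_p)
n_bonferroni = sum(bonferroni(null_p, alpha))
n_bh = sum(benjamini_hochberg(null_p, alpha))

print(f"uncorrected rejections: {n_uncorrected} ({n_uncorrected / n_cells:.1%} of cells)")
print(f"Bonferroni rejections:  {n_bonferroni}")
print(f"BH rejections:          {n_bh}")
```

Without correction, roughly 5% of the cells are flagged even though every null hypothesis is true, whereas both corrections typically leave few or none. With spatially autocorrelated data, the uncorrected false positives would additionally tend to form clusters, which this independent-cell sketch does not reproduce.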