Pièges et mésusages en analyse de données

ANESTHESIE & REANIMATION - Volume 9 - Pages 440-450 - 2023
Olivier Supplisson1,2, Mircea T. Sofonea3,4
1CIRB, CNRS, Inserm, Collège de France, Paris, France
2Sorbonne Université, Paris, France
3PCCEI, Univ Montpellier, Inserm, EFS, Montpellier, France
4CHU de Nîmes, Nîmes, France

References

Cohen, 1994, The Earth is round (P < .05), Am Psychol, 49, 997, 10.1037/0003-066X.49.12.997

Altman, 1994, Problems in dichotomizing continuous variables, Am J Epidemiol, 139, 442, 10.1093/oxfordjournals.aje.a117020

Cristea, 2018, P values in display items are ubiquitous and almost invariably significant: A survey of top science journals, PLOS ONE, 13, e0197440, 10.1371/journal.pone.0197440

Cohen, 1938, The misuse of statistics, J Am Stat Assoc, 33, 657, 10.1080/01621459.1938.10502344

Campbell, 2004

Edler, 2003, Randomized clinical trial: myths around elementary statistical principles, Oncol Res Treat, 26, 551, 10.1159/000074150

Goodman, 2016, What does research reproducibility mean?, Sci Transl Med, 8, 341ps12, 10.1126/scitranslmed.aaf5027

Greenland, 2019, Valid P-values behave exactly as they should: some misleading criticisms of P-values and their resolution with S-values, Am Stat, 73, 106, 10.1080/00031305.2018.1529625

Heinze, 2017, Five myths about variable selection, Transpl Int, 30, 6, 10.1111/tri.12895

Ioannidis, 2019, What have we (not) learnt from millions of scientific papers with P values?, Am Stat, 73, 20, 10.1080/00031305.2018.1447512

Makin, 2019, Ten common statistical mistakes to watch out for when writing or reviewing a manuscript. Rodgers P, Parsons N, Holmes N, editors, eLife, 8, e48175, 10.7554/eLife.48175

Nakagawa, 2007, Effect size, confidence interval and statistical significance: a practical guide for biologists, Biol Rev, 82, 591, 10.1111/j.1469-185X.2007.00027.x

van Smeden, 2020, Reflection on modern methods: five myths about measurement error in epidemiological research, Int J Epidemiol, 49, 338, 10.1093/ije/dyz251

Amrhein, 2019, Scientists rise up against statistical significance, Nature, 567, 305, 10.1038/d41586-019-00857-9

Benjamin, 2018, Redefine statistical significance, Nat Hum Behav, 2, 6, 10.1038/s41562-017-0189-z

Devezer, 2021, The case for formal methodology in scientific reform, R Soc Open Sci, 8, 200805, 10.1098/rsos.200805

McShane, 2019, Abandon statistical significance, Am Stat, 73, 235, 10.1080/00031305.2018.1527253

Wasserstein, 2016, The ASA statement on P-values: context, process, and purpose, Am Stat, 70, 129, 10.1080/00031305.2016.1154108

Stang, 2010, The ongoing tyranny of statistical significance testing in biomedical research, Eur J Epidemiol, 25, 225, 10.1007/s10654-010-9440-x

Yaddanapudi, 2016, The American Statistical Association statement on P-values explained, J Anaesthesiol Clin Pharmacol, 32, 421, 10.4103/0970-9185.194772

Greenland, 2023, Divergence versus decision P-values: A distinction worth making in theory and keeping in practice: Or, how divergence P-values measure evidence even when decision P-values do not, Scand J Stat, 50, 54, 10.1111/sjos.12625

Garamszegi, 2017, Perturbations on the uniform distribution of p-values can lead to misleading inferences from null-hypothesis testing, Trends Neurosci Educ, 8-9, 18, 10.1016/j.tine.2017.10.001

Nieuwenhuis, 2011, Erroneous analyses of interactions in neuroscience: a problem of significance, Nat Neurosci, 14, 1105, 10.1038/nn.2886

Freckleton, 2002, On the misuse of residuals in ecology: regression of residuals vs. multiple regression, J Anim Ecol, 71, 542, 10.1046/j.1365-2656.2002.00618.x

Franke, 2012, The Chi-Square Test: often used and more often misinterpreted, Am J Eval, 33, 448, 10.1177/1098214011426594

Johnson, 2013, Revised standards for statistical evidence, Proc Natl Acad Sci, 110, 19313, 10.1073/pnas.1313476110

Head, 2015, The extent and consequences of P-hacking in science, PLOS Biol, 13, e1002106, 10.1371/journal.pbio.1002106

Stefan, 2023, Big little lies: a compendium and simulation of P-hacking strategies, R Soc Open Sci, 10, 220346, 10.1098/rsos.220346

Westfall, 2011, On using the bootstrap for multiple comparisons, J Biopharm Stat, 21, 1187, 10.1080/10543406.2011.607751

Blume, 2018, Second-generation P-values: Improved rigor, reproducibility, & transparency in statistical analyses. Smalheiser NR, editor, PLOS ONE, 13, e0188299, 10.1371/journal.pone.0188299

Dmitrienko, 2018, Multiplicity considerations in clinical trials, N Engl J Med, 378, 2115, 10.1056/NEJMra1709701

Lantz, 2013, The large sample size fallacy, Scand J Caring Sci, 27, 487, 10.1111/j.1471-6712.2012.01052.x

Dettori, 2019, P-value worship: is the idol significant?, Glob Spine J, 9, 357, 10.1177/2192568219838538

Fethney, 2010, Statistical and clinical significance, and how to use confidence intervals to help interpret both, Aust Crit Care, 23, 93, 10.1016/j.aucc.2010.03.001

Hentschke, 2011, Computation of measures of effect size for neuroscience data sets, Eur J Neurosci, 34, 1887, 10.1111/j.1460-9568.2011.07902.x

Lee, 2016, Alternatives to P value: confidence interval and effect size, Korean J Anesthesiol, 69, 555, 10.4097/kjae.2016.69.6.555

Ioannidis, 2019, The importance of predefined rules and prespecified statistical analyses: do not abandon significance, JAMA, 321, 2067, 10.1001/jama.2019.4582

Lakens, 2021, The practical alternative to the P value is the correctly used P value, Perspect Psychol Sci, 16, 639, 10.1177/1745691620958012

Benjamin, 2019, Three recommendations for improving the use of P-values, Am Stat, 73, 186, 10.1080/00031305.2018.1543135

Hurlbert, 2019, Coup de grace for a tough old bull: “statistically significant” expires, Am Stat, 73, 352, 10.1080/00031305.2018.1543616

van de Schoot, 2021, Bayesian statistics and modelling, Nat Rev Methods Primers, 1, 1, 10.1038/s43586-020-00001-2

Nosek, 2018, The preregistration revolution, Proc Natl Acad Sci, 115, 2600, 10.1073/pnas.1708274114

DeCoster, 2009, A conceptual and empirical examination of justifications for dichotomization, Psychol Methods, 14, 349, 10.1037/a0016956

MacCallum, 2002, On the practice of dichotomization of quantitative variables, Psychol Methods, 7, 19, 10.1037/1082-989X.7.1.19

Holländer, 2004, Confidence intervals for the effect of a prognostic factor after selection of an ‘optimal’ cutpoint, Stat Med, 23, 1701, 10.1002/sim.1611

Marra, 2010, Penalised regression splines: theory and application to medical research, Stat Methods Med Res, 19, 107, 10.1177/0962280208096688

Randall, 2021, How did we get here: what are droplets and aerosols and how far do they go? A historical perspective on the transmission of respiratory infectious diseases, Interface Focus, 11, 20210049, 10.1098/rsfs.2021.0049

Berk, 2013, Valid post-selection inference, Ann Stat, 41, 802, 10.1214/12-AOS1077

Whittingham, 2006, Why do we still use stepwise modelling in ecology and behaviour?, J Anim Ecol, 75, 1182, 10.1111/j.1365-2656.2006.01141.x

Hoeting, 2009, The importance of accounting for spatial and temporal correlation in analyses of ecological data, Ecol Appl, 19, 574, 10.1890/08-0836.1

Leeb, 2005, Model selection and inference: facts and fiction, Econ Theory, 21, 21, 10.1017/S0266466605050036

Leeb, 2008, Can one estimate the unconditional distribution of post-model-selection estimators?, Econ Theory, 24, 338, 10.1017/S0266466608080158

Cinelli, 2022, A crash course in good and bad controls, Sociol Methods Res, 10.1177/00491241221099552

Schisterman, 2009, Overadjustment bias and unnecessary adjustment in epidemiologic studies, Epidemiology, 20, 488, 10.1097/EDE.0b013e3181a819a1

Tobler, 1970, A computer movie simulating urban growth in the Detroit region, Econ Geogr, 46, 234, 10.2307/143141

Griffith, 1992, What is spatial autocorrelation? Reflections on the past 25 years of spatial statistics, Espace Geogr, 21, 265, 10.3406/spgeo.1992.3091

Cori, 2013, A new framework and software to estimate time-varying reproduction numbers during epidemics, Am J Epidemiol, 178, 1505, 10.1093/aje/kwt133

MacNab, 2011, On Gaussian Markov random fields and Bayesian disease mapping, Stat Methods Med Res, 20, 49, 10.1177/0962280210371561

Gómez-Rubio, 2021, Estimating spatial econometrics models with integrated nested Laplace approximation, Mathematics, 9, 2044, 10.3390/math9172044

Conley, 1999, GMM estimation with cross sectional dependence, J Econom, 92, 1, 10.1016/S0304-4076(98)00084-0

Dupont, 2022, Spatial+: A novel approach to spatial confounding, Biometrics, 78, 1279, 10.1111/biom.13656

Khan, 2023

Britton, 2019, Estimation in emerging epidemics: biases and remedies, J R Soc Interface, 16, 20180670, 10.1098/rsif.2018.0670

Lipsitch, 2015, Potential biases in estimating absolute and relative case-fatality risks during outbreaks, PLoS Negl Trop Dis, 9, e0003846, 10.1371/journal.pntd.0003846

Russell, 2020, Estimating the infection and case fatality ratio for coronavirus disease (COVID-19) using age-adjusted data from the outbreak on the Diamond Princess cruise ship, February 2020, Eurosurveillance, 25, 2000256, 10.2807/1560-7917.ES.2020.25.12.2000256

Alizon, 2022, Epidemiological and clinical insights from SARS-CoV-2 RT-PCR crossing threshold values, France, January to November 2020, Eurosurveillance, 27, 2100406, 10.2807/1560-7917.ES.2022.27.6.2100406

Hay, 2021, Estimating epidemiologic dynamics from cross-sectional viral load distributions, Science, 373, eabh0635, 10.1126/science.abh0635

Woodfine, 2015, Berkson's paradox in medical care, J Intern Med, 278, 424, 10.1111/joim.12363

Monge, 2023, The imprinting effect of Covid-19 vaccines: an expected selection bias in observational studies, BMJ, 381, e074404, 10.1136/bmj-2022-074404

Brenner, 1997, Variation of sensitivity, specificity, likelihood ratios and predictive values with disease prevalence, Stat Med, 16, 981, 10.1002/(SICI)1097-0258(19970515)16:9<981::AID-SIM510>3.0.CO;2-N

Westreich, 2014, Epidemiology visualized: The prosecutor's fallacy, Am J Epidemiol, 179, 1125, 10.1093/aje/kwu025

Keogh, 2020, STRATOS guidance document on measurement error and misclassification of variables in observational epidemiology: Part 1 — Basic theory and simple methods of adjustment, Stat Med, 39, 2197, 10.1002/sim.8532

Neuhaus, 1999, Bias and efficiency loss due to misclassified responses in binary regression, Biometrika, 86, 843, 10.1093/biomet/86.4.843

Loken, 2017, Measurement error and the replication crisis, Science, 355, 584, 10.1126/science.aal3618

Shaw, 2020, STRATOS guidance document on measurement error and misclassification of variables in observational epidemiology: Part 2 — More complex methods of adjustment and advanced topics, Stat Med, 39, 2232, 10.1002/sim.8531

Innes, 2021, The measurement error elephant in the room: challenges and solutions to measurement error in epidemiology, Epidemiol Rev, 43, 94, 10.1093/epirev/mxab011

Sedgwick, 2015, Understanding the ecological fallacy, BMJ, 351, h4773, 10.1136/bmj.h4773

von Kügelgen, 2021, Simpson's paradox in COVID-19 case fatality rates: a mediation analysis of age-related causal effects, IEEE Trans Artif Intell, 2, 18, 10.1109/TAI.2021.3073088

Lee, 2021, Framework for the treatment and reporting of missing data in observational studies: The Treatment And Reporting of Missing data in Observational Studies framework, J Clin Epidemiol, 134, 79, 10.1016/j.jclinepi.2021.01.008

Enders, 2022

Seaman, 2013, What is meant by “missing at random”?, Stat Sci, 28, 257, 10.1214/13-STS415

Goldberg, 2021, Data missing not at random in mobile health research: assessment of the problem and a case for sensitivity analyses, J Med Internet Res, 23, e26749, 10.2196/26749

Lachin, 2016, Fallacies of last observation carried forward analyses, Clin Trials, 13, 161, 10.1177/1740774515602688

Eekhout, 2012, Missing data: a systematic review of how they are reported and handled, Epidemiology, 23, 729, 10.1097/EDE.0b013e3182576cdb

Hunt, 2021, A systematic review of how missing data are handled and reported in multi-database pharmacoepidemiologic studies, Pharmacoepidemiol Drug Saf, 30, 819, 10.1002/pds.5245

Little, 2021, Missing data assumptions, Annu Rev Stat Its Appl, 8, 89, 10.1146/annurev-statistics-040720-031104

Adams-Huet, 2009, Bridging clinical investigators and statisticians: writing the statistical methodology for a research proposal, J Investig Med, 57, 818, 10.2310/JIM.0b013e3181c2996c

Silberzahn, 2018, Many analysts, one data set: making transparent how variations in analytic choices affect results, Adv Methods Pract Psychol Sci, 1, 337, 10.1177/2515245917747646

Greenhalgh, 2022, Adapt or die: how the pandemic made the shift from EBM to EBM+ more urgent, BMJ Evid-Based Med, 27, 253, 10.1136/bmjebm-2022-111952

Peng, 2015, The reproducibility crisis in science: A statistical counterattack, Significance, 12, 30, 10.1111/j.1740-9713.2015.00827.x

Ipsos, 2022