Best (but oft-forgotten) practices: the multiple problems of multiplicity—whether and how to correct for many statistical tests

The American Journal of Clinical Nutrition - Volume 102 - Pages 721-728 - 2015
David L Streiner¹
¹Department of Psychiatry and Behavioral Neurosciences, McMaster University, Hamilton, Canada, and Department of Psychiatry, University of Toronto, Toronto, Canada

References

Bennett, 2010, Neural correlates of interspecies perspective taking in the post-mortem Atlantic salmon: an argument for proper multiple comparisons correction, Journal of Serendipitous and Unexpected Results, 1, 1
Austin, 2006, Testing multiple statistical hypotheses resulted in spurious associations: a study of astrological signs and health, J Clin Epidemiol, 59, 964, 10.1016/j.jclinepi.2006.01.012
McNeill, 2006, Metabolic syndrome and cardiovascular disease in older people: the Cardiovascular Health Study, J Am Geriatr Soc, 54, 1317, 10.1111/j.1532-5415.2006.00862.x
Norman, 2015
Seaman, 1991, New developments in pairwise multiple comparisons: some powerful and practicable procedures, Psychol Bull, 110, 577, 10.1037/0033-2909.110.3.577
Dunn, 1959, Estimation of the medians for dependent variables, Ann Math Stat, 30, 192, 10.1214/aoms/1177706374
Dunn, 1961, Multiple comparisons among means, J Am Stat Assoc, 56, 52, 10.1080/01621459.1961.10482090
Šidák, 1967, Rectangular confidence regions for the means of multivariate normal distributions, J Am Stat Assoc, 62, 626
Benjamini, 1995, Controlling the false discovery rate: a practical and powerful approach to multiple testing, J R Stat Soc B, 57, 289
Holm, 1979, A simple sequentially rejective multiple test procedure, Scand J Stat, 6, 65
Hochberg, 1988, A sharper Bonferroni procedure for multiple tests of significance, Biometrika, 75, 800, 10.1093/biomet/75.4.800
Efron, 1993
Westfall, 1989, p Value adjustments for multiple tests in multivariate binomial models, J Am Stat Assoc, 84, 780
Westfall, 1993
Ge, 2013, Resampling-based multiple testing for microarray data analysis
Moyé, 1998, P-value interpretation and alpha allocation in clinical trials, Ann Epidemiol, 8, 351, 10.1016/S1047-2797(98)00003-9
Cormier, 1999, Multiple comparisons: a cautionary tale about the dangers of fishing expeditions, Nutrition, 15, 332
Blakesley, 2009, Comparisons of methods for multiple hypothesis testing in neuropsychological research, Neuropsychology, 23, 255, 10.1037/a0012850
Rothman, 1990, No adjustments are needed for multiple comparisons, Epidemiology, 1, 43, 10.1097/00001648-199001000-00010
Cohen, 1994, The earth is round (p < .05), Am Psychol, 49, 997, 10.1037/0003-066X.49.12.997
Schulz, 2005, Multiplicity in randomised trials I: endpoints and treatments, Lancet, 365, 1591, 10.1016/S0140-6736(05)66461-6
Bozzetti, 2000, Perioperative total parenteral nutrition in malnourished, gastrointestinal cancer patients: a randomized, clinical trial, JPEN J Parenter Enteral Nutr, 24, 7, 10.1177/014860710002400107
Altman, 2000, Statistics in medical journals: some recent trends, Stat Med, 19, 3275, 10.1002/1097-0258(20001215)19:23<3275::AID-SIM626>3.0.CO;2-M
Altman, 1985, Comparability of randomised groups, Statistician, 34, 125, 10.2307/2987510
Roberts, 1999, Baseline imbalance in randomised controlled trials, BMJ, 319, 185, 10.1136/bmj.319.7203.185
Marti-Carvajal, 2010, Taking aim at a moving target: when a study changes in the middle, 299
Craig, 2000, Developing and evaluating complex interventions: new guidance, Medical Research Council
Ludwig, 2011, Neighborhoods, obesity, and diabetes—a randomized social experiment, N Engl J Med, 365, 1509, 10.1056/NEJMsa1103216
Cohen, 1968, Multiple regression as a general data-analytic system, Psychol Bull, 70, 426, 10.1037/h0026714
Cronbach, 1957, The two disciplines of scientific psychology, Am Psychol, 12, 671, 10.1037/h0043943
Armitage, 1969, Repeated significance tests on accumulating data, J R Stat Soc Ser A-G, 132, 235, 10.2307/2343787
Coffey, 2015, Statistical concepts for the stroke community: you may have worked on more adaptive designs than you think, Stroke, 46, e26, 10.1161/STROKEAHA.114.004288
Meinert, 1970, A study of the effects of hypoglycemic agents on vascular complications in patients with adult-onset diabetes, Diabetes, 19, 789
Moss, 2002, Prophylactic implantation of a defibrillator in patients with myocardial infarction and reduced ejection fraction, N Engl J Med, 346, 877, 10.1056/NEJMoa013474
Pocock, 1977, Group sequential methods in the design and analysis of clinical trials, Biometrika, 64, 191, 10.1093/biomet/64.2.191
O’Brien, 1979, A multiple testing procedure for clinical trials, Biometrics, 35, 549, 10.2307/2530245
Peto, 1976, Design and analysis of randomized clinical trials requiring prolonged observation of each patient. I. Introduction and design, Br J Cancer, 34, 585, 10.1038/bjc.1976.220
Pocock, 1999, Trials stopped early: too good to be true?, Lancet, 353, 943, 10.1016/S0140-6736(98)00379-1
Gelman A, Loken E. The garden of forking paths: why multiple comparisons can be a problem, even when there is no “fishing expedition” or “p-hacking” and the research hypothesis was posited ahead of time. 14 November 2013 [cited 2015 Apr 6]. Available from: http://www.stat.columbia.edu/~gelman/research/unpublished/p_hacking.pdf
Bowalekar, 2011, Adaptive designs in clinical trials, Perspect Clin Res, 2, 23, 10.4103/2229-3485.76286