Dichotomizing continuous predictors in multiple regression: a bad idea

Statistics in Medicine - Volume 25, Issue 1 - Pages 127-141 - 2006
Patrick Royston (1), Douglas G. Altman (2), Willi Sauerbrei (3)
(1) MRC Clinical Trials Unit, 222 Euston Road, London NW1 2DA, UK
(2) Centre for Statistics in Medicine, University of Oxford, Wolfson College Annexe, Linton Road, Oxford OX2 6UD, UK
(3) Institute of Medical Biometry and Medical Informatics, University Hospital of Freiburg, Stefan‐Meier‐Str. 25, 79104 Freiburg, Germany

Abstract

In medical research, continuous variables are often converted into categorical variables by grouping values into two or more categories. We consider in detail issues pertaining to creating just two groups, a common approach in clinical research. We argue that the simplicity achieved is gained at a cost; dichotomization may create rather than avoid problems, notably a considerable loss of power and residual confounding. In addition, the use of a data‐derived ‘optimal’ cutpoint leads to serious bias. We illustrate the impact of dichotomization of continuous predictor variables using as a detailed case study a randomized trial in primary biliary cirrhosis. Dichotomization of continuous data is unnecessary for statistical analysis and in particular should not be applied to explanatory variables in regression models. Copyright © 2005 John Wiley & Sons, Ltd.
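
The loss of power mentioned in the abstract can be illustrated with a small simulation. The following is a minimal sketch and not the authors' analysis: the sample size, the slope, the normal errors, the median split, and all function names are assumptions chosen for illustration. It tests the association between a predictor and an outcome twice on each simulated data set, once with the predictor kept continuous and once after splitting it at the sample median, and counts how often each test rejects the null hypothesis.

```python
import math
import random
import statistics

N = 50            # assumed sample size per simulated study
T_CRIT = 2.0106   # two-sided 5% critical value of Student's t with N - 2 = 48 df


def pearson_r(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rejects_null(x, y):
    """t-test of zero correlation; for a 0/1 predictor this is equivalent
    to the pooled two-sample t-test comparing the two groups."""
    r = pearson_r(x, y)
    t = r * math.sqrt((N - 2) / (1.0 - r * r))
    return abs(t) > T_CRIT


def simulate_power(beta=0.4, n_sims=2000, seed=1):
    """Return (power with continuous x, power with median-split x)."""
    rng = random.Random(seed)
    hits_continuous = 0
    hits_dichotomized = 0
    for _ in range(n_sims):
        x = [rng.gauss(0.0, 1.0) for _ in range(N)]
        y = [beta * xi + rng.gauss(0.0, 1.0) for xi in x]
        cut = statistics.median(x)
        x_split = [1.0 if xi > cut else 0.0 for xi in x]  # median split
        hits_continuous += rejects_null(x, y)
        hits_dichotomized += rejects_null(x_split, y)
    return hits_continuous / n_sims, hits_dichotomized / n_sims


if __name__ == "__main__":
    power_continuous, power_dichotomized = simulate_power()
    print(f"continuous predictor:   {power_continuous:.3f}")
    print(f"median-split predictor: {power_dichotomized:.3f}")
```

Under these assumptions the median-split analysis should reject the null less often than the continuous analysis on the same data. The size of the gap depends on the assumed slope and sample size, so the printed figures are illustrative only.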
