Scaling regression inputs by dividing by two standard deviations

Statistics in Medicine - Volume 27, Issue 15 - Pages 2865-2873 - 2008
Andrew Gelman
Department of Statistics and Department of Political Science, Columbia University, New York, NY, U.S.A.

Abstract

Interpretation of regression coefficients is sensitive to the scale of the inputs. One method often used to place input variables on a common scale is to divide each numeric variable by its standard deviation. Here we propose dividing each numeric variable by two times its standard deviation, so that the generic comparison is with inputs equal to the mean ±1 standard deviation. The resulting coefficients are then directly comparable to those for untransformed binary predictors. We have implemented the procedure as a function in R. We illustrate the method with two simple analyses that are typical of applied modeling: a linear regression of data from the National Election Study and a multilevel logistic regression of data on the prevalence of rodents in New York City apartments. We recommend our rescaling as a default option (an improvement upon the usual approach of including variables in whatever way they are coded in the data file) so that the magnitudes of coefficients can be directly compared as a matter of routine statistical practice. Copyright © 2007 John Wiley & Sons, Ltd.
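
The rescaling described in the abstract can be illustrated with a minimal Python sketch. This is not the author's R function: the name `rescale_2sd`, the `center` option, the use of the sample standard deviation, and the rule "two or fewer distinct values means binary" are all assumptions made for illustration. The motivation for the factor of two is that a binary 0/1 variable with equal proportions has a standard deviation of 0.5, so a change from 0 to 1 spans two standard deviations.

```python
from statistics import mean, stdev


def rescale_2sd(values, center=True):
    """Divide a numeric input by two times its standard deviation.

    Hypothetical helper for illustration only. Inputs with at most two
    distinct values are treated as binary and returned untransformed.
    Centering on the mean is an assumed default, not stated in the abstract.
    """
    xs = [float(v) for v in values]
    if len(set(xs)) <= 2:
        return xs  # binary (or constant) predictor: leave as coded
    shift = mean(xs) if center else 0.0
    scale = 2.0 * stdev(xs)  # sample standard deviation (assumption)
    return [(x - shift) / scale for x in xs]


# Made-up example data, not taken from the paper.
ages = [20, 30, 40, 50, 60]
scaled_ages = rescale_2sd(ages)

# A binary indicator passes through unchanged.
indicator = rescale_2sd([0, 1, 1, 0])
```

After rescaling, a numeric input has a standard deviation of 0.5, so a one-unit change in the rescaled variable corresponds to moving from one standard deviation below the mean to one standard deviation above it.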
