A cautionary case study of approaches to the treatment of missing data

Journal of the Italian Statistical Society - Volume 17, Issue 3 - Pages 351-372 - 2008
Christopher Paul¹, William M. Mason², Daniel F. McCaffrey¹, Sarah Fox³
¹ RAND, 4570 Fifth Ave., Suite 600, Pittsburgh, PA, 15213, USA
² California Center for Population Research, University of California, Los Angeles, 4284 Public Policy Building, PO Box 951484, Los Angeles, CA, 90095, USA
³ Department of Medicine, Division of General Internal Medicine and Health Services Research, University of California, Los Angeles, 1100 Glendon Ave., Suite 2010, Los Angeles, CA, 90024, USA

