Review of inverse probability weighting for dealing with missing data

Statistical Methods in Medical Research - Volume 22, Issue 3 - Pages 278-295 - 2013
Shaun R. Seaman¹, Ian R. White¹
¹MRC Biostatistics Unit, Institute of Public Health, Forvie Site, Robinson Way, Cambridge, UK

Abstract

The simplest approach to dealing with missing data is to restrict the analysis to complete cases, i.e. individuals with no missing values. This can induce bias, however. Inverse probability weighting (IPW) is a commonly used method to correct this bias. It is also used to adjust for unequal sampling fractions in sample surveys. This article is a review of the use of IPW in epidemiological research. We describe how the bias in the complete-case analysis arises and how IPW can remove it. IPW is compared with multiple imputation (MI) and we explain why, despite MI generally being more efficient, IPW may sometimes be preferred. We discuss the choice of missingness model and methods such as weight truncation, weight stabilisation and augmented IPW. The use of IPW is illustrated on data from the 1958 British Birth Cohort.
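The basic idea can be illustrated with a small simulation. The sketch below is not taken from the article: the data-generating model, the variable names and the probabilities are all assumptions chosen for illustration. A binary covariate `x` is fully observed, the outcome `y` is missing more often when `x = 1`, and the missingness model is simply the observed proportion of complete cases within each level of `x`. Under these assumptions the true mean of `y` is 1.0, the complete-case mean is pulled towards roughly 0.5, and weighting each complete case by the inverse of its estimated probability of being observed should bring the estimate back close to 1.0.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Hypothetical data: a fully observed binary covariate and an outcome
# whose mean depends on it, so that E[y] = 2 * 0.5 = 1.0.
x = rng.binomial(1, 0.5, size=n)
y = 2.0 * x + rng.normal(size=n)

# Assumed missingness mechanism: y is missing at random given x,
# and individuals with x = 1 are observed less often.
p_observed = np.where(x == 1, 0.3, 0.9)
observed = rng.random(n) < p_observed

# Complete-case analysis: ignore everyone with a missing outcome.
cc_mean = y[observed].mean()

# Missingness model (saturated in x): estimated probability of being
# a complete case within each level of x.
p_hat = np.where(x == 1, observed[x == 1].mean(), observed[x == 0].mean())

# IPW: weight each complete case by the inverse of that probability.
weights = 1.0 / p_hat[observed]
ipw_mean = np.sum(weights * y[observed]) / np.sum(weights)

print(f"complete-case mean: {cc_mean:.3f}")
print(f"IPW mean:           {ipw_mean:.3f}")
```

In a real analysis the missingness model would usually be a regression (for example, logistic) on several fully observed variables rather than a two-cell table, and very large weights might call for truncation or stabilisation as discussed in the article.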
