Globally Efficient Non-Parametric Inference of Average Treatment Effects by Empirical Balancing Calibration Weighting

Kwun Chuen Gary Chan¹, Sheung Chi Phillip Yam², Zheng Zhang²
¹University of Washington, Seattle, USA
²Chinese University of Hong Kong, People's Republic of China

Abstract

The estimation of average treatment effects based on observational data is extremely important in practice and has been studied by generations of statisticians under different frameworks. Existing globally efficient estimators require non-parametric estimation of a propensity score function, an outcome regression function or both, but their performance can be poor in practical sample sizes. Without explicitly estimating either function, we consider a wide class of calibration weights constructed to attain an exact three-way balance of the moments of observed covariates among the treated, the control and the combined group. The wide class includes exponential tilting, empirical likelihood and generalized regression as important special cases, and extends survey calibration estimators to different statistical problems and with important distinctions. Global semiparametric efficiency for the estimation of average treatment effects is established for this general class of calibration estimators. The results show that efficiency can be achieved by solely balancing the covariate distributions without resorting to direct estimation of the propensity score or outcome regression function. We also propose a consistent estimator for the efficient asymptotic variance, which does not involve additional functional estimation of either the propensity score or the outcome regression functions. The variance estimator proposed outperforms existing estimators that require a direct approximation of the efficient influence function.
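As a rough illustration of the three-way balance idea, the sketch below computes exponential-tilting weights for the treated and control groups so that each group's weighted covariate moments match those of the combined sample. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `tilting_weights`, the simulated data, the choice of first moments only (an intercept and two covariates), and the damped Newton solver are all hypothetical choices made for illustration. The growing moment basis, the general weight class and the variance estimator described in the abstract are not reproduced here.

```python
import numpy as np


def tilting_weights(u, mask, max_iter=100, tol=1e-8):
    """Weights w_i = exp(u_i @ lam) for the rows selected by `mask`, chosen so
    that (1/N) * sum_{i in mask} w_i * u_i equals the full-sample mean of u.
    Illustrative helper, not taken from the paper."""
    n, k = u.shape
    target = u.mean(axis=0)   # moments of the combined group
    ug = u[mask]              # moment functions within the selected group

    def objective(lam):
        # Convex objective whose gradient is the balance residual.
        return np.exp(ug @ lam).sum() / n - target @ lam

    lam = np.zeros(k)
    for _ in range(max_iter):
        w = np.exp(ug @ lam)
        grad = (ug * w[:, None]).sum(axis=0) / n - target
        if np.max(np.abs(grad)) < tol:
            break
        hess = (ug * w[:, None]).T @ ug / n
        step = np.linalg.solve(hess, grad)
        # Damped Newton step: halve the step until the objective stops increasing.
        t = 1.0
        f_old = objective(lam)
        while objective(lam - t * step) > f_old and t > 1e-8:
            t *= 0.5
        lam = lam - t * step
    return np.exp(ug @ lam)


# Simulated observational data (assumed design; the true treatment effect is 2).
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=(n, 2))
propensity = 1.0 / (1.0 + np.exp(-(0.5 * x[:, 0] - 0.25 * x[:, 1])))
treated = rng.random(n) < propensity
y = 1.0 + x[:, 0] + x[:, 1] + 2.0 * treated + rng.normal(size=n)

# Moment functions to balance: intercept plus first moments of each covariate.
u = np.column_stack([np.ones(n), x])

p = tilting_weights(u, treated)    # weights for treated units
q = tilting_weights(u, ~treated)   # weights for control units

# Weighted difference in means, each normalised by the full sample size.
tau_hat = (p * y[treated]).sum() / n - (q * y[~treated]).sum() / n
print("estimated average treatment effect:", tau_hat)
```

Neither a propensity score model nor an outcome regression is fitted at any point: the weights come only from the moment-balance constraints, which is the feature the abstract emphasises. In practice the set of balanced moments would be richer than the first moments used here.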
