Information Sets and Excess Zeros in Random Effects Modeling of Longitudinal Data

Statistics in Biosciences - Volume 2 - Pages 81–94 - 2010
Tze L. Lai (1), Kevin H. Sun (1), Samuel P. Wong (2)
(1) Department of Statistics, Stanford University, Stanford, USA
(2) Department of Statistics, The Chinese University of Hong Kong, Hong Kong, China

Abstract

Marginal regression via generalized estimating equations is widely used in biostatistics to model longitudinal data from subjects whose outcomes and covariates are observed at several time points. In this paper we consider two issues that have been raised in the literature concerning the marginal regression approach. The first is that even though the past history may be predictive of outcome, the marginal approach does not use this history. Although marginal regression has the flexibility of allowing between-subject variations in the observation times, it may lose substantial prediction power in comparison with the transitional modeling approach that relates the responses to the covariate and outcome histories. We address this issue by using the concept of “information sets” for prediction to generalize the “partly conditional mean” approach of Pepe and Couper (J. Am. Stat. Assoc. 92:991–998, 1997). This modeling approach strikes a balance between the flexibility of the marginal approach and the predictive power of transitional modeling. Another issue is the problem of excess zeros in the outcomes over what the underlying model for marginal regression implies. We show how our predictive modeling approach based on information sets can be readily modified to handle the excess zeros in the longitudinal time series. By synthesizing the marginal, transitional, and mixed effects modeling approaches in a predictive framework, we also discuss how their respective advantages can be retained while their limitations can be circumvented for modeling longitudinal data.
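The excess-zeros issue can be illustrated with a small numerical sketch. It assumes a zero-inflated Poisson mixture in the style of Lambert (1992), which is not necessarily the model developed in this paper; the function names and the parameter values (mu = 2.0, pi = 0.3) are illustrative assumptions only. The sketch compares the probability of a zero under a plain Poisson model with the probability under the mixture, where an extra point mass at zero is added with probability pi.

```python
import math


def poisson_pmf(k: int, mu: float) -> float:
    """Probability of count k under a Poisson distribution with mean mu."""
    return math.exp(-mu) * mu ** k / math.factorial(k)


def zip_pmf(k: int, mu: float, pi: float) -> float:
    """Zero-inflated Poisson (assumed mixture form, after Lambert 1992).

    With probability pi the outcome is a structural zero; otherwise it is
    drawn from Poisson(mu), which can itself produce a zero.
    """
    base = (1.0 - pi) * poisson_pmf(k, mu)
    return pi + base if k == 0 else base


def zip_mean(mu: float, pi: float) -> float:
    """Mean of the zero-inflated Poisson mixture: (1 - pi) * mu."""
    return (1.0 - pi) * mu


if __name__ == "__main__":
    mu, pi = 2.0, 0.3  # illustrative values, not taken from the paper
    print("P(Y = 0), Poisson:", poisson_pmf(0, mu))
    print("P(Y = 0), zero-inflated:", zip_pmf(0, mu, pi))
    print("Mean, zero-inflated:", zip_mean(mu, pi))
```

Whenever pi is positive, the mixture places more mass at zero than a Poisson model with the same mu, which is the kind of discrepancy the abstract calls excess zeros.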

References

Akaike H (1973) Information theory and an extension of the maximum likelihood principle. In: Proceedings of the 2nd international symposium on information theory. Akademiai Kiado, Budapest, pp 267–281
Anderson TW, Hsiao C (1981) Estimation of dynamic models with error components. J Am Stat Assoc 76:598–606
Breslow NE, Clayton DG (1993) Approximate inference in generalized linear mixed models. J Am Stat Assoc 88:9–25
Chaganty NR, Joe H (2004) Efficiency of generalized estimating equations for binary responses. J R Stat Soc Ser B 66:851–860
Diggle PJ, Liang KY, Zeger SL (1994) Analysis of longitudinal data. Oxford University Press, New York
Frees EW, Young VR, Luo Y (2001) Case studies using panel data models. N Am Actuar J 5:24–42
Hall DB (2000) Zero-inflated Poisson and binomial regression with random effects: a case study. Biometrics 56:1030–1039
Hasan MT, Sneddon G, Ma R (2009) Pattern-mixture zero inflated mixed models for longitudinal unbalanced count data with excessive zeros. Biom J 51:946–960
Hastie T, Tibshirani R, Friedman J (2009) The elements of statistical learning: data mining, inference and prediction, 2nd edn. Springer, New York
Lai TL, Lee CP (1997) Information and prediction criteria for model selection in stochastic regression and ARMA models. Stat Sin 7:285–309
Lai TL, Shih MC (2003) Nonparametric estimation in nonlinear mixed effects models. Biometrika 90:1–13
Lai TL, Shih MC (2003) A hybrid estimator in nonlinear and generalised linear mixed effects models. Biometrika 90:859–879
Lai TL, Shih MC, Wong SP (2006) A new approach to modeling covariate effects and individualization in population pharmacokinetics–pharmacodynamics. J Pharmacokinet Pharmacodyn 33:49–74
Lai TL, Small D (2007) Marginal regression analysis of longitudinal data with time-dependent covariates: a generalized method-of-moments approach. J R Stat Soc Ser B 69:79–99
Lambert D (1992) Zero-inflated Poisson regression, with an application to defects in manufacturing. Technometrics 34:1–14
Lee Y, Nelder JA (1996) Hierarchical generalized linear models (with discussion). J R Stat Soc Ser B 58:619–678
Lee AH, Wang K, Scott JA, et al. (2006) Multi-level zero-inflated Poisson regression modelling of correlated count data with excess zeros. Stat Methods Med Res 15:47–61
Liang KY, Zeger SL (1986) Longitudinal data analysis using generalized linear models. Biometrika 73:13–22
Lu S, Lin Y, Shih WJ (2004) Analyzing excessive no changes in clinical trials with clustered data. Biometrics 60:257–267
Min Y, Agresti A (2005) Random effect models for repeated measures of zero-inflated count data. Stat Model 5:1–19
Pepe MS, Couper D (1997) Modeling partly conditional means with longitudinal data. J Am Stat Assoc 92:991–998
Pepe MS, Anderson GL (1994) A cautionary note on inference for marginal regression models with longitudinal data and general correlated response data. Commun Stat Simul Comput 23:939–951
Rissanen J (1986) Stochastic complexity and modeling. Ann Stat 14:1080–1100
Schildcrout JS, Heagerty PJ (2005) Regression analysis of longitudinal binary data with time-dependent environmental covariates: bias and efficiency. Biostatistics 6:633–652
Wei CZ (1992) On predictive least squares principles. Ann Stat 20:1–42
Yau KK, Lee AH (2001) Zero-inflated Poisson regression with random effects to evaluate an occupational injury prevention. Stat Med 20:2907–2920
Yau KK, Lee AH, Ng AS (2002) A zero-augmented Gamma mixed model for longitudinal data with many zeros. Aust N Z J Stat 44:177–183
Zeger SL, Liang KY (1992) An overview of methods for analysis of longitudinal data. Stat Med 11:1825–1839
Zeger SL, Qaqish B (1988) Markov regression models for time series: a quasi-likelihood approach. Biometrics 44:1019–1031
Zhang P (1997) Comment on “An asymptotic theory for linear model selection” by J Shao. Stat Sin 7:254–258