Prediction in Multilevel Generalized Linear Models
Abstract
We discuss prediction of random effects and of expected responses in multilevel generalized linear models. Prediction of random effects is useful, for instance, in small area estimation, disease mapping, effectiveness studies and model diagnostics. Prediction of expected responses is useful for planning, model interpretation and diagnostics. For prediction of random effects, we concentrate on empirical Bayes prediction and discuss three different kinds of standard errors: the posterior standard deviation and the marginal prediction error standard deviation (comparative standard errors), and the marginal sampling standard deviation (diagnostic standard error). Analytical expressions are available only for linear models and are provided in an appendix. For other multilevel generalized linear models, we present approximations and suggest using parametric bootstrapping to obtain standard errors. We also discuss prediction of expectations of responses or probabilities for a new unit in a hypothetical cluster, in a new (randomly sampled) cluster or in an existing cluster. The methods are implemented in gllamm and illustrated by applying them to survey data on reading proficiency of children nested in schools. Simulations are used to assess the performance of various predictions and associated standard errors for logistic random-intercept models under a range of conditions.
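As a rough illustration of the linear case mentioned in the abstract, the sketch below computes empirical Bayes predictions and the three kinds of standard errors for a linear random-intercept model. It is not the paper's gllamm implementation. It assumes the model y_ij = beta + zeta_j + eps_ij with zeta_j ~ N(0, psi) and eps_ij ~ N(0, theta), and it treats beta, psi and theta as known, so the extra terms that arise from estimating the parameters are ignored. The function name and argument names are hypothetical.

```python
import math


def eb_random_intercept(cluster_means, cluster_sizes, beta, psi, theta):
    """Empirical Bayes predictions for a linear random-intercept model.

    Assumed model: y_ij = beta + zeta_j + eps_ij,
    zeta_j ~ N(0, psi), eps_ij ~ N(0, theta).
    beta, psi and theta are treated as known (a simplifying assumption).
    """
    results = []
    for ybar, n in zip(cluster_means, cluster_sizes):
        # Shrinkage factor: share of the variance of the cluster mean
        # that is due to the random intercept.
        shrinkage = psi / (psi + theta / n)
        # Posterior mean of zeta_j given the cluster mean.
        prediction = shrinkage * (ybar - beta)
        # Posterior standard deviation of zeta_j.
        posterior_sd = math.sqrt(psi * (1.0 - shrinkage))
        # With known parameters, the marginal prediction error variance
        # Var(prediction - zeta_j) coincides with the posterior variance.
        comparative_se = posterior_sd
        # Marginal sampling standard deviation of the prediction itself.
        diagnostic_se = math.sqrt(shrinkage * psi)
        results.append(
            {
                "prediction": prediction,
                "posterior_sd": posterior_sd,
                "comparative_se": comparative_se,
                "diagnostic_se": diagnostic_se,
            }
        )
    return results


if __name__ == "__main__":
    # Two hypothetical clusters with the same mean but different sizes.
    for row in eb_random_intercept([2.0, 2.0], [4, 40], beta=1.0, psi=1.0, theta=4.0):
        print(row)
```

Under these assumptions the squared diagnostic and comparative standard errors sum to psi, so larger clusters show less shrinkage, a smaller comparative standard error and a larger diagnostic standard error. For non-linear models such as the logistic random-intercept model, no closed form of this kind is available, which is where the approximations and parametric bootstrapping described in the abstract come in.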