Inferential Tools for Assessing Dependence Across Response Categories in Multinomial Models with Discrete Random Effects

Chiara Masci¹, Francesca Ieva¹, Anna Maria Paganoni¹
¹ MOX - Modelling and Scientific Computing, Department of Mathematics, Politecnico di Milano, 20133, Milan, Italy

Abstract

We propose a multinomial regression model with discrete random effects for estimation and inference with categorical, hierarchical data. The random effects are assumed to follow a discrete distribution with an a priori unknown number of support points. For a K-category response, the model identifies a latent structure at the highest level of grouping, where groups are clustered into subpopulations. The model does not assume independence across the random effects associated with different response categories, which improves on the multinomial semi-parametric multilevel model previously proposed in the literature. Because the category-specific random effects arise from the same subjects, the independence assumption seldom holds in real data. To evaluate the improvements offered by the proposed model, we reproduce simulation and case studies from the literature, highlighting the method's ability to model the real data structure properly and the advantages of accounting for the dependence structure in the data.
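To make the data structure described in the abstract concrete, the following is a minimal data-generating sketch, not the estimation procedure of the paper. It assumes one possible reading of the model: a three-category response with the first category as reference, a single covariate, and random intercepts only. Each group draws one of M support points, and every support point is a joint vector holding one intercept per non-reference category, so the category-specific effects are not independent. All names and numeric values (`category_probs`, `simulate`, `support`, `weights`, `beta`) are hypothetical and chosen only for illustration.

```python
import math
import random


def category_probs(x, beta, c):
    """Multinomial-logit probabilities; category 0 is the reference.

    beta[k] and c[k] are the (assumed) fixed slope and random intercept
    for non-reference category k + 1.
    """
    etas = [0.0] + [c_k + b_k * x for b_k, c_k in zip(beta, c)]
    top = max(etas)  # subtract the maximum for numerical stability
    exps = [math.exp(e - top) for e in etas]
    total = sum(exps)
    return [e / total for e in exps]


def simulate(n_groups, n_per_group, beta, support, weights, seed=0):
    """Simulate (group, subpopulation, covariate, response) tuples."""
    rng = random.Random(seed)
    data = []
    for g in range(n_groups):
        # Each group belongs to exactly one latent subpopulation m.
        m = rng.choices(range(len(support)), weights=weights)[0]
        for _ in range(n_per_group):
            x = rng.gauss(0.0, 1.0)
            p = category_probs(x, beta, support[m])
            y = rng.choices(range(len(p)), weights=p)[0]
            data.append((g, m, x, y))
    return data


# Hypothetical values: M = 2 support points, each a joint vector of
# K - 1 = 2 category-specific intercepts, with mixing weights 0.4 / 0.6.
support = [(-1.0, -2.0), (0.5, 1.5)]
weights = [0.4, 0.6]
beta = (0.8, -0.5)

data = simulate(n_groups=10, n_per_group=20, beta=beta,
                support=support, weights=weights)
```

Because a group receives its intercepts for all categories from the same support point, the dependence across response categories is built into the support itself rather than assumed away. In the model of the abstract the number of support points is unknown a priori; it is fixed here only to keep the sketch short.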
