Testing heterogeneity in quantile regression: a multigroup approach

Computational Statistics - Pages 1-24 - 2023
Cristina Davino (1), Giuseppe Lamberti (1), Domenico Vistocco (2)
(1) Department of Economics and Statistics, University of Naples Federico II, Naples, Italy
(2) Department of Political Sciences, University of Naples Federico II, Naples, Italy

Abstract

This paper introduces a multigroup approach for assessing group effects in quantile regression. The procedure estimates the same regression model at different quantiles and for different groups of observations, where the groups are defined by the levels of one or more stratification variables. Group effects are tested through a computational procedure. In particular, a bootstrap parametric test and a permutation test are compared on artificial data, taking into account different sample sizes and evaluating how well each test detects low, medium, and high differences among the coefficients pertaining to different groups. An empirical analysis of MOOC students' performance shows the proposal in action: the effects of the two main drivers of performance, learning and engagement, are explored at different conditional quantiles, comparing self-paced with instructor-paced courses offered on the EdX platform.
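The general idea of a permutation test for a group effect at a given quantile can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: it assumes two groups, a single predictor, and the absolute difference between group slopes as the test statistic. The function names (`fit_quantile_line`, `slope_gap`, `permutation_pvalue`), the simulated data, and the brute-force fitter are all hypothetical choices made for this example; a real analysis would use a dedicated quantile regression solver.

```python
import random
from itertools import combinations


def check_loss(residual, tau):
    """Check (pinball) loss of one residual at quantile level tau."""
    return residual * (tau - (residual < 0))


def fit_quantile_line(x, y, tau):
    """Fit y = a + b*x at quantile tau by brute force (illustrative only).

    With a single predictor, some minimiser of the check loss passes through
    two observations, so every pair of points is tried as a candidate line.
    """
    best = None
    for i, j in combinations(range(len(x)), 2):
        if x[i] == x[j]:
            continue  # a vertical line is not a valid candidate
        slope = (y[j] - y[i]) / (x[j] - x[i])
        intercept = y[i] - slope * x[i]
        loss = sum(check_loss(yk - intercept - slope * xk, tau)
                   for xk, yk in zip(x, y))
        if best is None or loss < best[0]:
            best = (loss, intercept, slope)
    if best is None:
        raise ValueError("need at least two observations with distinct x")
    return best[1], best[2]


def slope_gap(x, y, group, tau):
    """Absolute difference between the slopes fitted in groups 0 and 1."""
    slopes = []
    for label in (0, 1):
        xs = [xi for xi, gi in zip(x, group) if gi == label]
        ys = [yi for yi, gi in zip(y, group) if gi == label]
        slopes.append(fit_quantile_line(xs, ys, tau)[1])
    return abs(slopes[0] - slopes[1])


def permutation_pvalue(x, y, group, tau, n_perm=99, seed=0):
    """Share of label shufflings whose slope gap reaches the observed one."""
    rng = random.Random(seed)
    observed = slope_gap(x, y, group, tau)
    labels = list(group)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # break any link between group and (x, y)
        if slope_gap(x, y, labels, tau) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Simulated data: the group slopes 1.0 and 3.0 are assumed for the demo.
    rng = random.Random(42)
    x, y, group = [], [], []
    for label, slope in ((0, 1.0), (1, 3.0)):
        for _ in range(15):
            xi = rng.uniform(0, 5)
            x.append(xi)
            y.append(slope * xi + rng.gauss(0, 0.5))
            group.append(label)
    for tau in (0.25, 0.5, 0.75):
        gap = slope_gap(x, y, group, tau)
        p = permutation_pvalue(x, y, group, tau)
        print(f"tau={tau}: slope gap={gap:.2f}, permutation p-value={p:.3f}")
```

Repeating the test at several values of `tau` mirrors the idea of checking whether a group effect appears only in some parts of the conditional distribution. The brute-force fitter scales poorly with sample size and is used here only to keep the sketch free of external dependencies.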
