Multidimensional Item Response Theory in the Style of Collaborative Filtering
Abstract
This paper presents a machine learning approach to multidimensional item response theory (MIRT), a class of latent factor models that can be used to model and predict student performance from observed assessment data. Inspired by collaborative filtering, we define a general class of models that includes many MIRT models. We discuss the use of penalized joint maximum likelihood to estimate individual models and cross-validation to select the best-performing model. This model evaluation process can be optimized using batching techniques, so that even sparse, large-scale data can be analyzed efficiently. We illustrate our approach with simulated and real data, including an example from a massive open online course. The high-dimensional model fit to this large and sparse dataset does not lend itself well to traditional methods of factor interpretation. By analogy to recommender-system applications, we propose an alternative “validation” of the factor model, using auxiliary information about the popularity of items consulted during an open-book examination in the course.
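As a rough illustration of the estimation and selection steps named in the abstract, the following is a minimal sketch of penalized joint maximum likelihood for a compensatory MIRT model, with the penalty weight chosen on held-out responses. It is not the paper's implementation. The function names (`fit_pjml`, `predict`, `log_loss`), the L2 penalty, the full-batch preconditioned gradient descent, the single hold-out split (instead of full cross-validation), and the simulated data are all assumptions made for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_loss(X, p, mask):
    """Mean negative Bernoulli log-likelihood over the entries selected by mask."""
    p = np.clip(p, 1e-9, 1 - 1e-9)
    ll = X * np.log(p) + (1 - X) * np.log(1 - p)
    return -ll[mask].mean()

def predict(theta, a, b):
    """P(correct) for every student-item pair: sigmoid(theta_i . a_j + b_j)."""
    return sigmoid(theta @ a.T + b)

def fit_pjml(X, mask, n_factors=2, lam=0.01, lr=0.5, n_iter=300, seed=0):
    """Jointly estimate student factors theta, item loadings a, and intercepts b.

    X    : (n_students, n_items) array of 0/1 responses
    mask : boolean array, True where a response is observed and used for fitting
    lam  : weight of the (assumed) L2 penalty on theta and a
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    theta = 0.1 * rng.standard_normal((n, n_factors))
    a = 0.1 * rng.standard_normal((m, n_factors))
    b = np.zeros(m)
    for _ in range(n_iter):
        # Residual p - x is the gradient of the negative log-likelihood
        # with respect to the logit; unobserved entries contribute nothing.
        resid = np.where(mask, predict(theta, a, b) - X, 0.0)
        # Gradients are rescaled by the matrix dimensions to keep steps stable.
        g_theta = resid @ a / m + lam * theta
        g_a = resid.T @ theta / n + lam * a
        g_b = resid.mean(axis=0)
        theta -= lr * g_theta
        a -= lr * g_a
        b -= lr * g_b
    return theta, a, b

# Simulated sparse response matrix (illustrative values only).
rng = np.random.default_rng(1)
n, m, k = 200, 30, 2
true_theta = rng.standard_normal((n, k))
true_a = rng.uniform(0.5, 1.5, (m, k))
true_b = rng.standard_normal(m)
X = (rng.random((n, m)) < sigmoid(true_theta @ true_a.T + true_b)).astype(float)

observed = rng.random((n, m)) < 0.7                 # about 30% of responses missing
holdout = observed & (rng.random((n, m)) < 0.2)     # validation entries
train = observed & ~holdout

# Select the penalty weight by held-out log-loss.
cv = {}
for lam in (0.001, 0.01, 0.1):
    theta, a, b = fit_pjml(X, train, n_factors=k, lam=lam)
    cv[lam] = log_loss(X, predict(theta, a, b), holdout)
best_lam = min(cv, key=cv.get)
print(cv, best_lam)
```

The same loop could range over the number of factors as well as the penalty weight. For large, sparse data, the full-batch updates would presumably be replaced by mini-batches over observed responses, which is the role the abstract assigns to batching.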