Using country-specific Q-matrices for cognitive diagnostic assessments with international large-scale data

Jolien Delafontaine1, Changsheng Chen1, Jung Yeon Park1, Wim Van Den Noortgate2
1Faculty of Psychology and Educational Sciences, KU Leuven, Leuven, Belgium
2Imec Research Group ITEC, KU Leuven, Kortrijk, Belgium

Abstract

In cognitive diagnosis assessment (CDA), the impact of misspecified item-attribute relations (the “Q-matrix”) designed by subject-matter experts has been a major challenge for real-world applications. This study examined parameter estimation of CDA models under the expert-designed Q-matrix and two refined Q-matrices using international large-scale data. Specifically, the G-DINA model was used to analyze Grade 8 TIMSS data for five selected countries separately, and the need for a country-specific refined Q-matrix was investigated. The results suggested that the two refined Q-matrices fitted the data better than the expert-designed Q-matrix, and that the stepwise validation method outperformed the nonparametric classification method, yielding substantively different classifications of students into attribute mastery patterns and different item parameter estimates. The results confirmed that country-specific Q-matrices based on the G-DINA model led to a better fit than a universal expert-designed Q-matrix.
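To illustrate the kind of classification the abstract refers to, the sketch below shows how a Q-matrix and item parameters map a response vector to an attribute mastery pattern under the basic DINA model (a special case of G-DINA). This is a minimal illustrative example, not the authors' analysis: the Q-matrix, slip, and guess values are hypothetical, and classification is done by a simple maximum-likelihood search over all mastery patterns.

```python
from itertools import product
import numpy as np

def dina_classify(responses, Q, slip, guess):
    """Return the attribute mastery pattern maximizing the DINA likelihood.

    Illustrative sketch only: real G-DINA analyses (e.g. with the GDINA R
    package) estimate item parameters from data rather than fixing them.
    """
    n_items, n_attr = Q.shape
    best_pattern, best_loglik = None, -np.inf
    for alpha in product([0, 1], repeat=n_attr):
        alpha = np.array(alpha)
        # eta_j = 1 iff the examinee masters every attribute item j requires
        eta = np.all(alpha >= Q, axis=1).astype(int)
        # DINA success probability: P(X_j = 1) = (1 - s_j) if eta_j = 1, else g_j
        p = np.where(eta == 1, 1 - slip, guess)
        loglik = np.sum(responses * np.log(p) + (1 - responses) * np.log(1 - p))
        if loglik > best_loglik:
            best_pattern, best_loglik = tuple(alpha), loglik
    return best_pattern

# Toy Q-matrix: 4 items measuring 2 attributes (hypothetical values)
Q = np.array([[1, 0], [0, 1], [1, 1], [1, 0]])
slip = np.full(4, 0.1)   # slip rate s_j per item
guess = np.full(4, 0.2)  # guess rate g_j per item

# An examinee who solves items 1, 3, 4 but misses item 2 is most
# plausibly a master of attribute 1 only.
print(dina_classify(np.array([1, 0, 1, 1]), Q, slip, guess))  # → (1, 0)
```

A misspecified Q-matrix changes `eta` for some items and can therefore shift examinees into different mastery patterns, which is why Q-matrix refinement matters for the classification results the abstract describes.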
