Likelihood-Based Item-Fit Indices for Dichotomous Item Response Theory Models
Abstract
New goodness-of-fit indices are introduced for dichotomous item response theory (IRT) models. These indices are based on the likelihoods of number-correct scores derived from the IRT model, and they provide a direct comparison of the modeled and observed frequencies for correct and incorrect responses for each number-correct score. The behavior of Pearson's X² (S-X²) and the likelihood ratio G² (S-G²) was assessed in a simulation study and compared with two fit indices similar to those currently in use (Q1-X² and Q1-G²). The simulations included three conditions in which the simulating and fitting models were identical and three conditions involving model misspecification. S-X² performed well, with Type I error rates close to the expected .05 and .01 levels. Performance of this index improved with increased test length. S-G² tended to reject the null hypothesis too often, as did Q1-X² and Q1-G². The power of S-X² appeared to be similar for all test lengths, but varied depending on the type of model misspecification.
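The abstract gives no formulas, so the following is only a minimal sketch of how a statistic of this kind can be computed, assuming the commonly cited form of S-X²: for each number-correct score k, the observed proportion correct on an item is compared with the model-implied proportion, where the latter comes from score likelihoods built with the Lord-Wingersky recursion. The function names, the two-parameter logistic item parameters, and the quadrature grid are all illustrative assumptions, and the collapsing of sparse score groups and the degrees-of-freedom calculation used in practice are omitted.

```python
import numpy as np

def score_likelihoods(p):
    """Lord-Wingersky recursion.

    p: (n_items, n_quad) probabilities of a correct response at each
       quadrature point. Returns (n_items + 1, n_quad): the likelihood
       of each number-correct score given theta.
    """
    n_items, n_quad = p.shape
    f = np.zeros((n_items + 1, n_quad))
    f[0] = 1.0  # before any item is added, score 0 is certain
    for i in range(n_items):
        nxt = f * (1.0 - p[i])   # item i wrong: score unchanged
        nxt[1:] += f[:-1] * p[i]  # item i right: score moves up by one
        f = nxt
    return f

def expected_proportions(p, weights, item):
    """Model-implied proportion correct on `item` for scores 1..n-1."""
    n_items = p.shape[0]
    f_all = score_likelihoods(p)
    f_rest = score_likelihoods(np.delete(p, item, axis=0))  # item left out
    expected = np.full(n_items + 1, np.nan)
    for k in range(1, n_items):
        num = np.sum(p[item] * f_rest[k - 1] * weights)
        den = np.sum(f_all[k] * weights)
        expected[k] = num / den
    return expected

def s_x2(responses, p, weights, item):
    """Pearson-type item-fit statistic over number-correct score groups.

    Simplification: no collapsing of sparse groups (an assumption of
    this sketch, not the published procedure).
    """
    n_items = p.shape[0]
    scores = responses.sum(axis=1)
    expected = expected_proportions(p, weights, item)
    stat = 0.0
    for k in range(1, n_items):
        in_group = scores == k
        n_k = in_group.sum()
        if n_k == 0:
            continue  # empty score group contributes nothing
        observed = responses[in_group, item].mean()
        e = expected[k]
        stat += n_k * (observed - e) ** 2 / (e * (1.0 - e))
    return stat

# Hypothetical 2PL parameters and a simple normal quadrature grid.
a = np.array([1.0, 1.2, 0.8, 1.5, 1.0])
b = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])

def two_pl(theta):
    return 1.0 / (1.0 + np.exp(-a[:, None] * (theta[None, :] - b[:, None])))

theta_grid = np.linspace(-4.0, 4.0, 41)
weights = np.exp(-0.5 * theta_grid ** 2)
weights /= weights.sum()
p_grid = two_pl(theta_grid)

# Simulate responses from the same model that is being checked.
rng = np.random.default_rng(0)
theta_sim = rng.standard_normal(2000)
responses = (rng.random((2000, 5)) < two_pl(theta_sim).T).astype(int)

stat = s_x2(responses, p_grid, weights, item=0)
print(f"S-X2 for item 0: {stat:.3f}")
```

Because the data here are simulated from the fitted model, the statistic should usually be small relative to a chi-square reference distribution; in a real analysis the item parameters would be estimated rather than known.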