Asymptotically Correct Standardization of Person-Fit Statistics Beyond Dichotomous Items
References
Armstrong, R., Stoumbos, Z., Kung, M., & Shi, M. (2007). On the performance of $$l_z$$ person-fit statistic. Practical Assessment, Research, and Evaluation, 12(16), 1–10.
Chon, K. H., Lee, W., & Ansley, T. N. (2013). An empirical investigation of methods for assessing item fit for mixed format tests. Applied Measurement in Education, 26, 1–15.
Chon, K. H., Lee, W., & Dunbar, S. B. (2010). A comparison of item fit statistics for mixed IRT models. Journal of Educational Measurement, 47, 318–338.
Costa, P. T., & McCrae, R. R. (1992). Normal personality assessment in clinical practice: The NEO Personality Inventory. Psychological Assessment, 4, 5–13.
Drasgow, F., Levine, M. V., & McLaughlin, M. E. (1987). Detecting inappropriate test scores with optimal and practical appropriateness indices. Applied Psychological Measurement, 11, 59–79.
Drasgow, F., Levine, M. V., & Williams, E. A. (1985). Appropriateness measurement with polychotomous item response models and standardized indices. British Journal of Mathematical and Statistical Psychology, 38, 67–86.
Emons, W. H. M. (2008). Nonparametric person-fit analysis of polytomous item scores. Applied Psychological Measurement, 32, 224–247.
Finkelman, M., & Kim, W. (2007). Using person fit in a body of work standard setting. Paper presented at the annual meeting of the American Educational Research Association, Chicago, IL.
Glas, C. A. W., & Dagohoy, A. V. T. (2007). A person fit test for IRT models for polytomous items. Psychometrika, 72, 159–180.
Glas, C. A. W., & Meijer, R. R. (2003). A Bayesian approach to person fit analysis in item response theory models. Applied Psychological Measurement, 27, 217–233.
Hanson, B., & Harris, D. J. (1994). A comparison of several statistical methods for examining allegations of copying (ACT research report series no. 87–15). Iowa City, IA: American College Testing.
Hoadley, B. (1971). Asymptotic properties of maximum likelihood estimators for the independent not identically distributed case. The Annals of Mathematical Statistics, 42, 1977–1991.
Klauer, K. C. (1991). An exact and optimal standardized person test for assessing consistency with the Rasch model. Psychometrika, 56, 213–228.
Kolen, M. J., & Lee, W. (2011). Psychometric properties of scores on mixed-format tests. Educational Measurement: Issues and Practice, 30(2), 15–24.
Levine, M. V., & Rubin, D. B. (1979). Measuring the appropriateness of multiple-choice test scores. Journal of Educational Statistics, 4, 269–290.
Li, M. F., & Olejnik, S. (1997). The power of Rasch person-fit statistics in detecting unusual response patterns. Applied Psychological Measurement, 21, 215–231.
Magis, D. (2015). A note on weighted likelihood and Jeffreys modal estimation of proficiency levels in polytomous item response models. Psychometrika, 80, 200–204.
Magis, D., Beland, S., & Raiche, G. (2014). Snijders’s correction of infit and outfit indexes with estimated ability level: An analysis with the Rasch model. Journal of Applied Measurement, 15, 82–93.
Magis, D., Raiche, G., & Beland, S. (2012). A didactic presentation of Snijders’s $$l^*_z$$ index of person fit with emphasis on response model selection and ability estimation. Journal of Educational and Behavioral Statistics, 37, 57–81.
Magis, D., & Verhelst, N. (2014). On the finiteness and uniqueness of the weighted likelihood estimator of ability in polytomous IRT models. Research Center for Examination and Certification Workshop on IRT and Educational Measurement, University of Twente, The Netherlands.
Meijer, R. R., Egberink, I. J., Emons, W. H., & Sijtsma, K. (2008). Detection and validation of unscalable item score patterns using item response theory: An illustration with Harter’s Self-Perception Profile for Children. Journal of Personality Assessment, 90, 227–238.
Meijer, R. R., & Nering, M. L. (1997). Trait level estimation for nonfitting response vectors. Applied Psychological Measurement, 21, 321–336.
Meijer, R. R., & Sijtsma, K. (2001). Methodology review: Evaluating person fit. Applied Psychological Measurement, 25, 107–135.
Meijer, R. R., & Tendeiro, J. N. (2012). The use of the $$l_z$$ and $$l^*_z$$ person-fit statistics and problems derived from model misspecification. Journal of Educational and Behavioral Statistics, 37, 758–766.
Molenaar, I. W., & Hoijtink, H. (1990). The many null distributions of person fit indices. Psychometrika, 55, 75–106.
Muraki, E. (1992). A generalized partial credit model: Application of an EM algorithm. Applied Psychological Measurement, 16, 159–176.
R Core Team. (2015). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing.
Rohatgi, V. K., & Saleh, A. K. M. E. (2001). An introduction to probability and statistics. New York, NY: Wiley.
Rubin, D. B. (1984). Bayesianly justifiable and relevant frequency calculations for the applied statistician. Annals of Statistics, 12, 1151–1172.
Samejima, F. (1973). Estimation of latent ability using a pattern of graded scores. Psychometrika, 38, 203–219.
Sijtsma, K., & Meijer, R. R. (2001). The person response function as a tool in person-fit research. Psychometrika, 66, 191–207.
Sinharay, S. (2015). A note on the asymptotic distribution of estimates of the ability parameter: Beyond dichotomous items and unidimensional IRT models. (under review).
Sinharay, S. Assessment of person fit for mixed-format tests. Journal of Educational and Behavioral Statistics. (in press).
Sinharay, S., Wan, P., Whitaker, M., Kim, D., Zhang, L., & Choi, S. W. (2014). Determining the overall impact of interruptions during online testing. Journal of Educational Measurement, 51, 419–440.
Smith, R. M. (1986). Person fit in the Rasch model. Educational and Psychological Measurement, 46, 359–372.
Snijders, T. (2001). Asymptotic distribution of person-fit statistics with estimated person parameter. Psychometrika, 66, 331–342.
Tao, J., Shi, N., & Chang, H. (2012). Item-weighted likelihood method for ability estimation in tests composed of both dichotomous and polytomous items. Journal of Educational and Behavioral Statistics, 37, 298–315.
Tendeiro, J. N., & Meijer, R. R. (2014). Detection of invalid test scores: The usefulness of simple nonparametric statistics. Journal of Educational Measurement, 51, 239–259.
van Krimpen-Stoop, E. M. L. A., & Meijer, R. R. (1999). The null distribution of person-fit statistics for conventional and adaptive tests. Applied Psychological Measurement, 23, 327–345.
van Krimpen-Stoop, E. M. L. A., & Meijer, R. R. (2002). Detection of person misfit in computerized adaptive tests with polytomous items. Applied Psychological Measurement, 26, 164–180.
Warm, T. A. (1989). Weighted likelihood estimation of ability in item response theory. Psychometrika, 54, 427–450.
Wright, B. D., & Masters, G. N. (1982). Rating scale analysis [Computer Software]. Chicago, IL: Mesa Press.
Wright, B. D., & Stone, M. H. (1979). Best test design. Chicago, IL: Mesa Press.