Measuring differential item and test functioning across academic disciplines
References
Alavi SM, Karami H: Differential item functioning and ad hoc interpretations. TELL 2010, 4(1):1–18.
Alavi SM, Rezaee AA, Amirian SMR: Academic discipline DIF in an English language proficiency test. Journal of English Language Teaching and Learning 2011, 5(7):39–65.
Alderson JC, Urquhart A: The effect of students’ academic discipline on their performance on ESP reading tests. Language Testing 1985, 2: 192–204. 10.1177/026553228500200207
Bachman LF: Fundamental considerations in language testing. Oxford: Oxford University Press; 1990.
Beglar D: A Rasch-based validation of the vocabulary size test. Language Testing 2010, 27: 101–118. 10.1177/0265532209340194
Bond TG, Fox CM: Applying the Rasch model: Fundamental measurement in the human sciences. Mahwah, NJ: LEA; 2001.
Bond TG, Fox CM: Applying the Rasch model: Fundamental measurement in the human sciences. 2nd edition. New Jersey: Lawrence Erlbaum; 2007.
Camilli G: Test fairness. In Educational measurement. 4th edition. Edited by: Brennan R. New York: American Council on Education & Praeger series on higher education; 2006:221–256.
Chang HH, Mazzeo J: The unique correspondence of the item response function and item category response functions in polytomously scored item response models. Psychometrika 1994, 59: 391–404. 10.1007/BF02296132
Chapman M: A case study of the need for change in the language testing policies of a Japanese corporation. JLTA Journal 2005, 8: 51–67.
Chen Z, Henning G: Linguistic and cultural bias in language proficiency tests. Language Testing 1985, 2(2):155–163. 10.1177/026553228500200204
Clauser BE, Mazor KM: Using statistical procedures to identify differentially functioning test items. Educational Measurement: Issues and Practice 1998, 17: 31–44.
Cole NS: History and development of DIF. Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.; 1993.
Donoghue JR, Holland PW, Thayer DT: A Monte Carlo study of factors that affect the Mantel-Haenszel and standardization measures of differential item functioning. In Differential item functioning. Edited by: Holland PW, Wainer H. Hillsdale, NJ: Lawrence Erlbaum Associates; 1993:137–166.
Elder C: The effect of language background on “foreign” language test performance: The case of Chinese, Italian, and Modern Greek. Language Learning 1996, 46: 233–282. 10.1111/j.1467-1770.1996.tb01236.x
Geranpayeh A, Kunnan AJ: Differential item functioning in terms of age in the certificate in advanced English examination. Language Assessment Quarterly 2007, 4: 190–222.
Karami H: Detecting gender bias in a language proficiency test. International Journal of Language Studies 2011, 5(2):27–38.
Linacre JM: Test validity and Rasch measurement: Construct, content, etc. Rasch Measurement Transactions 2004, 18(1):970–971.
Linacre JM: A user's guide to WINSTEPS-MINISTEP: Rasch-model computer programs. Chicago, IL; 2007. winsteps.com
Linacre JM: A user's guide to winsteps/ministeps: Rasch model computer programs. Chicago, IL; 2008. winsteps.com
Linacre JM: Winsteps® (Version 3.70.0) [Computer Software]. Beaverton, Oregon: Winsteps.com; 2010.
Lumley T, O’Sullivan B: The effect of test-taker gender, audience and topic on task performance in tape-mediated assessment of speaking. Language Testing 2005, 22(4):415–437. 10.1191/0265532205lt303oa
Meade AW, Fetzer M: Test bias, differential prediction, and a revised approach for determining the suitability of a predictor in a selection context. Organizational Research Methods 2009, 12(4):738–761. 10.1177/1094428109331487
Messick S: Validity. In Educational measurement. Edited by: Linn RL. New York: Macmillan; 1989:13–103.
Pae T: DIF for learners with different academic backgrounds. Language Testing 2004, 21: 53–73. 10.1191/0265532204lt274oa
Raju NS, van der Linden WJ, Fleer PF: IRT-based internal measures of differential functioning of items and tests. Applied Psychological Measurement 1995, 19: 353–368. 10.1177/014662169501900405
Rasch G: Probabilistic models for some intelligence and attainment tests. Expanded edition. Chicago: University of Chicago Press; 1980.
Runnels J: Using the Rasch model to validate a multiple choice English achievement test. International Journal of Language Studies 2012, 6(4):141–155.
Smith EV: Metric development and score reporting in Rasch measurement. Journal of Applied Measurement 2000, 1: 303–326.
Smith EV: Evidence for the reliability of measures and validity of measure interpretation: A Rasch measurement perspective. Journal of Applied Measurement 2001, 2: 281–311.
Smith R: Detecting item bias with the Rasch model. Journal of Applied Measurement 2004, 5(4):430–449.
Smith RM, Plackner C: The family approach to assessing fit in Rasch measurement. Journal of Applied Measurement 2009, 10(4):424–437.
Swaminathan H, Rogers HJ: Detecting differential item functioning using logistic regression procedures. Journal of Educational Measurement 1990, 27: 361–370. 10.1111/j.1745-3984.1990.tb00754.x
Takala S, Kaftandjieva F: Test fairness: A DIF analysis of an L2 vocabulary test. Language Testing 2000, 17: 323–340.
Thompson B: Foundations of behavioral statistics: An insight-based approach. London: The Guilford Press; 2006.
Weir CJ: Language testing and validation: An evidence-based approach. Hampshire, UK: Palgrave Macmillan; 2005.
Weaver C: A Rasch-based evaluation of the presence of item bias in a placement examination designed for an EFL reading program. In Second Language Acquisition - Theory and Pedagogy: Proceedings of the 6th Annual JALT Pan-SIG Conference. Edited by: Newfields T, Gledall I, Wanner P, Kawate-Mierzejewska M. 2007. Retrieved September 21, 2012 from http://jalt.org/pansig/2007/HTML/Weaver.htm
Wolfe EW: Equating and item banking with the Rasch model. In Introduction to Rasch measurement. Edited by: Smith E, Smith R. Maple Grove, MN: JAM Press; 2004:360–390.
Wright BD, Masters GN: Number of person or item strata. Rasch Measurement Transactions 2002, 16: 888.
Wright BD, Stone MH: Best test design. Chicago: MESA Press; 1979.