Investigating the characteristics of language test specifications and item writer guidelines, and their effect on item development: a mixed-method case study
Abstract
This study discusses the characteristics of test specifications (specs) and item writer guidelines (IWGs), their role in developing items for English as a Second Language (ESL) reading tests, and the use of the Common European Framework of Reference (CEFR) for specs development. This mixed-method study analyzed specs, IWGs, tests, and item statistics from the Pearson Test of English General. In addition, interviews and focus groups were conducted with the developers of the specs and IWGs and with item writers. The findings show that there is no single way of conceptualizing specs and IWGs, and that translating the CEFR reading descriptors into specs is a challenging task. However, results from the judgmental study and the item statistics suggest that the investigated specs and IWGs facilitated the development of good-quality items at a certain difficulty level. The study reveals the potential role of specs and IWGs in establishing test validity. It contributes to the under-researched area of specs and IWGs by showing the type of information required for effective item writing and ways of enhancing the validity and reliability of tests. Practical and theoretical suggestions, together with directions for future research, are also identified.
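The abstract does not specify which item statistics were examined. As a point of reference only, the minimal sketch below (not taken from the study) computes two classical item statistics commonly used to judge item quality and difficulty: facility (the proportion of test takers answering an item correctly) and corrected item-total discrimination. The response matrix is invented and the NumPy-based computation is an illustrative assumption, not the authors' procedure.

```python
# Illustrative sketch of classical item statistics (facility and discrimination).
# The response data below are invented for demonstration purposes.
import numpy as np

# rows = test takers, columns = dichotomously scored items (1 = correct, 0 = incorrect)
responses = np.array([
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [0, 1, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 1],
])

# Facility (item difficulty): proportion of test takers answering each item correctly
facility = responses.mean(axis=0)

# Corrected item-total discrimination: correlation of each item's scores with the
# total score computed from the remaining items
totals = responses.sum(axis=1)
discrimination = np.array([
    np.corrcoef(responses[:, i], totals - responses[:, i])[0, 1]
    for i in range(responses.shape[1])
])

print("facility:", np.round(facility, 2))
print("discrimination:", np.round(discrimination, 2))
```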