Assessment by Comparative Judgement: An Application to Secondary Statistics and English in New Zealand
Abstract
Keywords
References
Alomran, M., & Chia, D. (2018). Automated scoring system for multiple choice test with quick feedback. International Journal of Information and Education Technology, 8, 538–545. https://doi.org/10.18178/ijiet.2018.8.8.1096.
Assessment Research Group. (2009). Assessment in schools: Fit for purpose? A commentary by the teaching and learning research programme. London: Economic and Social Research Council.
Baird, J.-A., Andrich, D., Hopfenbeck, T. N., & Stobart, G. (2017). Assessment and learning: Fields apart? Assessment in Education: Principles, Policy & Practice, 24, 317–350. https://doi.org/10.1080/0969594X.2017.1319337.
Berkowitz, B. W., Fitch, R., & Kopriva, R. (2000). The use of tests as part of high-stakes decision-making for students: A resource guide for educators and policy-makers. Washington, DC: Office for Civil Rights (ED).
Bisson, M.-J., Gilmore, C., Inglis, M., & Jones, I. (2016). Measuring conceptual understanding using comparative judgement. International Journal of Research in Undergraduate Mathematics Education, 2, 141–164. https://doi.org/10.1007/s40753-016-0024-3.
Black, P., Burkhardt, H., Daro, P., Jones, I., Lappan, G., Pead, D., & Stephens, M. (2012). High-stakes examinations to support policy. Educational Designer, 2(5), 1–31. https://www.educationaldesigner.org/ed/volume2/issue5/article16/
Bramley, T. (2007). Paired comparison methods. In P. Newton, J.-A. Baird, H. Goldstein, H. Patrick, & P. Tymms (Eds.), Techniques for monitoring the comparability of examination standards (pp. 264–294). London: QCA.
Heldsinger, S., & Humphry, S. (2010). Using the method of pairwise comparison to obtain reliable teacher assessments. The Australian Educational Researcher, 37, 1–19. https://doi.org/10.1007/BF03216919.
Hipkins, R., Johnston, M., & Sheehan, M. (2016). NCEA in context. Wellington, New Zealand: NZCER Press. https://www.nzcer.org.nz/nzcerpress/ncea-context.
Hunter, J., & Jones, I. (2018). Free-response tasks in primary mathematics: A window on students’ thinking. In Proceedings of the 41st annual conference of the Mathematics Education Research Group of Australasia (Vol. 41, pp. 400–407). Auckland, New Zealand: MERGA.
Jones, I., & Alcock, L. (2014). Peer assessment without assessment criteria. Studies in Higher Education, 39(10), 1774–1787. https://doi.org/10.1080/03075079.2013.821974.
Jones, I., Bisson, M., Gilmore, C., & Inglis, M. (2019). Measuring conceptual understanding in randomised controlled trials: Can comparative judgement help? British Educational Research Journal, 45, 662–680. https://doi.org/10.1002/berj.3519.
Jones, I., & Inglis, M. (2015). The problem of assessing problem solving: Can comparative judgement help? Educational Studies in Mathematics, 89, 337–355. https://doi.org/10.1007/s10649-015-9607-.
Jones, I., Inglis, M., Gilmore, C., & Hodgen, J. (2013). Measuring conceptual understanding: The case of fractions. In A. M. Lindmeier & A. Heinze (Eds.), Proceedings of the 37th Conference of the International Group for the Psychology of Mathematics Education (Vol. 3, pp. 113–120). Kiel, Germany: IGPME.
Jones, I., & Karadeniz, I. (2016). An alternative approach to assessing achievement. In C. Csikos, A. Rausch, & J. Szitanyi (Eds.), The 40th Conference of the International Group for the Psychology of Mathematics Education (Vol. 3, pp. 51–58). Szeged, Hungary: IGPME.
Jones, I., & Sirl, D. (2017). Peer assessment of mathematical understanding using comparative judgement. Nordic Studies in Mathematics Education, 22, 147–164.
Jones, I., Swan, M., & Pollitt, A. (2014). Assessing mathematical problem solving using comparative judgement. International Journal of Science and Mathematics Education, 13(1), 151–177. https://doi.org/10.1007/s10763-013-9497-6.
Jones, I., & Wheadon, C. (2015). Peer assessment using comparative and absolute judgement. Studies in Educational Evaluation, 47, 93–101. https://doi.org/10.1016/j.stueduc.2015.09.004.
Kane, M. T. (2013). Validating the interpretations and uses of test scores. Journal of Educational Measurement, 50, 1–73. https://doi.org/10.1111/jedm.12000.
Meadows, M., & Billington, L. (2005). A review of the literature on marking reliability. Manchester: AQA & the National Assessment Agency.
Murphy, R. (1982). A further report of investigations into the reliability of marking of GCE examinations. British Journal of Educational Psychology, 52, 58–63. https://doi.org/10.1111/j.2044-8279.1982.tb02503.x.
Newton, P. (1996). The reliability of marking of General Certificate of Secondary Education scripts: Mathematics and English. British Educational Research Journal, 22, 405–420. https://doi.org/10.1080/0141192960220403.
Newton, P., & Shaw, S. (2014). Validity in educational and psychological assessment. Cambridge: Sage Publications.
Pollitt, A. (2012). The method of adaptive comparative judgement. Assessment in Education: Principles, Policy & Practice, 19, 281–300. https://doi.org/10.1080/0969594X.2012.665354.
Steedle, J. T., & Ferrara, S. (2016). Evaluating comparative judgment as an approach to essay scoring. Applied Measurement in Education, 29, 211–223. https://doi.org/10.1080/08957347.2016.1171769.
Steiger, J. H. (1980). Tests for comparing elements of a correlation matrix. Psychological Bulletin, 87, 245.
Suto, W. M. I., & Nadas, R. (2009). Why are some GCSE examination questions harder to mark accurately than others? Using Kelly’s Repertory Grid technique to identify relevant question features. Research Papers in Education, 24, 335–377. https://doi.org/10.1080/02671520801945925.
Thurstone, L. L. (1927). A law of comparative judgment. Psychological Review, 34, 273–286. https://doi.org/10.1037/h0070288.
Thurstone, L. L. (1954). The measurement of values. Psychological Review, 61, 47–58. https://doi.org/10.1037/h0060035.
van Daal, T., Lesterhuis, M., Coertjens, L., Donche, V., & De Maeyer, S. (2019). Validity of comparative judgement to assess academic writing: Examining implications of its holistic character and building on a shared consensus. Assessment in Education: Principles, Policy & Practice, 26, 59–74. https://doi.org/10.1080/0969594X.2016.1253542.
Verhavert, S., Bouwer, R., Donche, V., & De Maeyer, S. (2019). A meta-analysis on the reliability of comparative judgement. Assessment in Education: Principles, Policy & Practice, 26, 1–22. https://doi.org/10.1080/0969594X.2019.1602027.
Wiliam, D. (2001). Reliability, validity, and all that jazz. Education 3–13: International Journal of Primary, Elementary and Early Years Education, 29, 17–21. https://doi.org/10.1080/03004270185200311.
