Rater effects: Ego engagement in rater decision-making
References
Anderson, 1983
Bachman, 1995, Investigating variability in tasks and rater judgments in a performance test of foreign language speaking, Language Testing, 12, 238, 10.1177/026553229501200206
Bachman, 1996
Barkaoui, 2010, Variability in ESL essay rating processes: The role of the rating scale and rater experience, Language Assessment Quarterly, 7, 54, 10.1080/15434300903464418
Breland, H., Bridgeman, B., & Fowles, M. E. (1999). Writing assessment in admission to higher education: Review and framework (ETS RR-99-3). Princeton, NJ: Educational Testing Service.
Brennan, 1995, Generalizability analyses of Work Keys listening and writing tests, Educational and Psychological Measurement, 55, 157, 10.1177/0013164495055002001
Brown, 1991, Do English and ESL faculties rate writing samples differently?, TESOL Quarterly, 25, 587, 10.2307/3587078
Caracelli, 1993, Data analysis strategies for mixed-method evaluation designs, Educational Evaluation and Policy Analysis, 15, 195, 10.3102/01623737015002195
Connor, 1993, The interpretation of tasks by writers and readers in holistically rated direct assessment of writing, 141
Creswell, 2007
Cronbach, 1990
Cumming, 1990, Expertise in evaluating second language compositions, Language Testing, 7, 31, 10.1177/026553229000700104
Cumming, 2001, Assessing L2 writing: Alternative constructs and ethical dilemmas, Assessing Writing, 8, 73, 10.1016/S1075-2935(02)00047-8
Cumming, A., Kantor, R., Powers, D., Santos, T., & Taylor, C. (2000). TOEFL 2000 writing framework: A working paper (TOEFL Monograph Series Report No. 18). Princeton, NJ: Educational Testing Service.
DeRemer, 1998, Writing assessment: Raters’ elaboration of the rating task, Assessing Writing, 5, 7, 10.1016/S1075-2935(99)80003-8
Engelhard, 1994, Examining rater errors in the assessment of written composition with a many-faceted Rasch model, Journal of Educational Measurement, 31, 93, 10.1111/j.1745-3984.1994.tb00436.x
Ericsson, 1984
Fulcher, 2003
Guilford, 1954
Hamp-Lyons, 1994, Examining expert judgments of task difficulty on essay tests, Journal of Second Language Writing, 3, 49, 10.1016/1060-3743(94)90005-1
Henning, 1996, Accounting for nonsystematic error in performance ratings, Language Testing, 13, 53, 10.1177/026553229601300104
Huot, 1993, The influence of holistic scoring procedures on reading and rating student essays
Johnson, 2009, The influence of rater language background on writing performance assessment, Language Testing, 26, 485, 10.1177/0265532209340186
Kim, 2009, An investigation into native and non-native teachers’ judgments of oral English performance: A mixed methods approach, Language Testing, 26, 187, 10.1177/0265532208101010
Kobayashi, 1992, Native and nonnative reactions to ESL compositions, TESOL Quarterly, 26, 81, 10.2307/3587370
Kondo-Brown, 2002, A FACETS analysis of rater bias in measuring Japanese second language writing performance, Language Testing, 19, 3, 10.1191/0265532202lt218oa
Linacre, J. M. (2005). A user's guide to FACETS: Rasch measurement computer program. Version 3.57. Chicago, IL.
Lumley, 2005
Lynch, 1998, Using G-theory and many-facet Rasch measurement in the development of performance assessments of the ESL speaking skills of immigrants, Language Testing, 15, 158, 10.1177/026553229801500202
McNamara, 1996
Mendelsohn, 1987, Professors’ ratings of language use and rhetorical organizations in ESL compositions, TESL Canada Journal, 5, 9, 10.18806/tesl.v5i1.512
Milanovic, 1996, A study of the decision-making behavior of composition markers, 92
Myford, C., & Wolfe, E. (2000). Monitoring sources of variability within the Test of Spoken English assessment system (Research Project 65). Princeton, NJ: Educational Testing Service.
Myford, 2004, Detecting and measuring rater effects using many-facet Rasch measurement: Part 1, 460
Orr, 2002, The FCE speaking test: Using rater reports to help interpret test scores, System, 30, 143, 10.1016/S0346-251X(02)00002-7
Saal, 1980, Rating the ratings: Assessing the psychometric quality of rating data, Psychological Bulletin, 88, 413, 10.1037/0033-2909.88.2.413
Sakyi, 2000, Validation of holistic scoring for ESL writing assessment: How raters evaluate ESL compositions, 129
Santos, 1988, Professors’ reactions to the academic writing of non-native speaking students, TESOL Quarterly, 20, 38
Shohamy, 1992, The effect of raters’ background and training on the reliability of direct writing tests, The Modern Language Journal, 76, 27, 10.1111/j.1540-4781.1992.tb02574.x
Smith, 2000, Rater judgments in the direct assessment of competency-based second language writing ability, 159
Spool, 1978, Training programs for observers of behaviors: A review, Personnel Psychology, 31, 853, 10.1111/j.1744-6570.1978.tb02128.x
Stansfield, 1988, A long-term research agenda for the Test of Written English, Language Testing, 5, 160, 10.1177/026553228800500204
Stock, 1987, Taking on testing: Teachers as tester-researchers, English Education, 19, 93
Sweedler-Brown, 1993, ESL essay evaluation: The influences of sentence-level and rhetorical features, Journal of Second Language Writing, 2, 3, 10.1016/1060-3743(93)90003-L
Van Weeren, 1987, Testing pronunciation: An application of generalizability theory, Language Learning, 37, 109, 10.1111/j.1467-1770.1968.tb01314.x
Vann, 1990, Error gravity: Faculty response to errors in the written discourse of nonnative speakers of English, 181
Vaughan, 1991, Holistic assessment: What goes on in the raters’ minds?, 111
Weigle, 1994, Effects of training on raters of ESL compositions, Language Testing, 11, 197, 10.1177/026553229401100206
Weigle, 1998, Using FACETS to model rater training effects, Language Testing, 15, 263, 10.1177/026553229801500205
Weir, 2005
Wiseman, C. (2005). A validation study comparing an analytic scoring rubric and a holistic scoring rubric in the assessment of L2 writing samples. Unpublished paper, Teachers College, Columbia University, NY.