Comparing the accuracy of different scoring methods for identifying sixth graders at risk of failing a state writing assessment

Assessing Writing - Volume 27 - Pages 11-23 - 2016
Joshua Wilson¹, Natalie G. Olinghouse², D. Betsy McCoach², Tanya Santangelo³, Gilbert N. Andrada⁴
¹ School of Education, University of Delaware, Newark, DE, United States
² Department of Educational Psychology, University of Connecticut, Mansfield, CT, United States
³ Department of Education, Arcadia University, Glenside, PA, United States
⁴ Psychometrics and Applied Research, Bureau of Student Assessment, Connecticut State Department of Education, Hartford, CT, United States
