Reliability of multi-category rating scales

Journal of School Psychology - Tập 51 - Trang 217-229 - 2013
Richard I. Parker1, Kimberly J. Vannest1, John L. Davis1
1Texas A&M University at College Station, USA

Tài liệu tham khảo

Agresti, 2002 Altman, 1991 Bailey, 1970, Home-based reinforcement and the modification or pre-delinquents' classroom behavior, Journal of Applied Behavior Analysis, 3, 223, 10.1901/jaba.1970.3-223 Bakeman, 1997, Detecting sequential patterns and determining their reliability with fallible observers, Psychological Methods, 2, 357, 10.1037/1082-989X.2.4.357 Bandalos, 1996, The effects of nonnormality and number of response categories on reliability, Applied Measurement in Education, 9, 151, 10.1207/s15324818ame0902_4 Bendig, 1953, The reliability of self-ratings as a function of the amount of verbal anchoring and the number of categories on the scale, The Journal of Applied Psychology, 37, 38, 10.1037/h0057911 Bendig, 1954, Reliability and the number of rating scale categories, Journal of Applied Psychology, 38, 38, 10.1037/h0055647 Berk, 1979, Generalizability of behavioral observations: A clarification of interobserver agreement and interobserver reliability, American Journal of Mental Deficiency, 83, 460 Bornstein, 1980, Behaviorally specific report cards and self-determined reinforcements: A multiple baseline analysis of inmate offenses, Behavior Modification, 4, 71, 10.1177/014544558041004 Burke, 2009, Reliability of frequent retrospective behavior ratings for elementary school students with EBD, Behavioral Disorders, 34, 212, 10.1177/019874290903400403 Chafouleas, 2002, Good, bad, or in-between: How does the daily behavior report card rate?, Psychology in the Schools, 39, 157, 10.1002/pits.10027 Chafouleas, 2006, Acceptability and reported use of daily behavior report cards among teachers, Journal of Positive Behavior Interventions, 8, 174, 10.1177/10983007060080030601 Christ, 2009, Foundation for the development and use of direct behavior rating (DBR) to assess and evaluate student behavior, Assessment for Effective Intervention, 34, 201, 10.1177/1534508409340390 Cicchetti, 1985, The effect of number of rating scale categories on levels of inter-rater reliability: A Monte-Carlo investigation, Applied Psychological Measurement, 9, 31, 10.1177/014662168500900103 Cohen, 1960, A coefficient of agreement for nominal scales, Educational and Psychological Measurement, 20, 37, 10.1177/001316446002000104 Cohen, 1968, Weighed kappa: Nominal scale agreement with provision for scaled disagreement or partial credit, Psychological Bulletin, 70, 213, 10.1037/h0026256 Cone, 1977, The relevance of reliability and validity for behavioral assessment, Behavior Therapy, 8, 411, 10.1016/S0005-7894(77)80077-4 Conners, 2008 Cooper, 2007 Cox, 1980, The optimal number of response alternatives for a scale: A review, Journal of Marketing Research, 17, 407, 10.2307/3150495 Cramer, 1980 Davies, 1989, Effects of a daily report card on disruptive behaviour in primary students, B.C. Journal of Special Education, 13, 173 Dawes, 2007, Do data characteristics change according to the number of scale points used? An experiment using 5-point, 7-point, 10-point scales, International Journal of Market Research, 50, 61, 10.1177/147078530805000106 Deno, 2003, Developments in curriculum-based measurement, Journal of Special Education, 37, 184, 10.1177/00224669030370030801 Di Eugenio, 2004, The kappa statistic: A second look, Computational Linguistics, 30, 95, 10.1162/089120104773633402 Fabiano, 2009, An investigation of the technical adequacy of a daily behavior report card (DBRC) for monitoring progress of students with attention-deficit/hyperactivity disorder in special education placements, Assessment for Effective Intervention, 34, 231, 10.1177/1534508409333344 Finn, 1972, Effects of some variations in rating scale characteristics on the means and reliabilities of ratings, Educational and Psychological Measurement, 34, 885 Fleiss, 1971, Measuring nominal scale agreement among many raters, Psychological Bulletin, 76, 378, 10.1037/h0031619 Fleiss, 1973, The equivalence of weighted kappa and the intraclass correlation coefficient as measures of reliability, Educational and Psychological Measurement, 33, 613, 10.1177/001316447303300309 Good, 1998, Contemporary perspectives on curriculum-based measurement validity, 61 Green, 1970, Rating scales and information recovery: How many scales and response categories to use?, Journal of Marketing, 34, 33, 10.2307/1249817 Gresham, 1990 Gwet Gwet, 2010 Hambleton, 2003, Advances in criterion-referenced testing methods and practices, 377 Hintze, 2005, Psychometrics of direct observation, School Psychology Review, 34, 507, 10.1080/02796015.2005.12088012 Hintze, 2006 Irtel, 2005, Psychophysical scaling, Vol. 3, 1628 Jenkins, 1977, A Monte-Carlo study of factors affecting three indices of composite scale reliability, Journal of Applied Psychology, 62, 392, 10.1037/0021-9010.62.4.392 Jurbergs, 2007, School-home notes with and without response cost: Increasing attention and academic performance in low-income children with attention-deficit/hyperactivity disorder, School Psychology Quarterly, 22, 358, 10.1037/1045-3830.22.3.358 Kaplan, 1964 Kelley, 1995, Promoting academic performance in inattentive children. The relative efficacy of school-home notes with and without response cost, Behavior Modification, 19, 357, 10.1177/01454455950193006 Kendall, 1938, A new measure of rank correlation, Biometrika, 30, 81, 10.1093/biomet/30.1-2.81 Kendall, 1970 King, 1999, Goal attainment scaling: Its use in evaluating pediatric therapy programs, Physical & Occupational Therapy in Pediatrics, 19, 31, 10.1300/J006v19n02_03 Kiresuk, 1994 Kline, 2004 Kolbe, 1991, Content analysis research: An examination of applications with directives for improving research reliability and objectivity, Journal of Consumer Research, 18, 243, 10.1086/209256 Krippendorff, 1980 Krosnick, J. A., & Tahk A. M. (2010). The optimal length of rating scales to maximize reliability and validity. Unpublished manuscript. Lissitz, 1975, Effect of the number of scale points on reliability: A Monte-Carlo approach, Journal of Applied Psychology, 60, 10, 10.1037/h0076268 Loken, 1987, The use of 0–10 scales in telephone surveys, Journal of the Market Research Society, 29, 353 Lombard, 2010 Martin, 1973, The effects of scaling on the correlation coefficient: A test of validity, Journal of Marketing Research, 10, 316, 10.2307/3149702 Martin, 1978, Effects of scaling on the correlation coefficient: Additional considerations, Journal of Marketing Research, 15, 314, 10.2307/3151268 Mash, 1974, Situational effects on observer accuracy: Behavioral predictability, prior experience, and complexity of coding categories, Child Development, 45, 367, 10.2307/1127957 McCain, 1993, Managing the classroom behavior of an ADHD preschooler: The efficacy of a school–home note intervention, Child and Family Behavior Therapy, 15, 33, 10.1300/J019v15n03_03 McDermott, 1988, Agreement among diagnosticians or observers: Its importance and determination, Professional School Psychology, 3, 225, 10.1037/h0090563 McKelvie, 1978, Graphic rating scales: How many categories?, British Journal of Psychology, 69, 185, 10.1111/j.2044-8295.1978.tb01647.x Miller, 1956, The magical number seven, plus or minus two: Some limits on our capacity for processing information, Psychological Review, 63, 81, 10.1037/h0043158 Nelson, 2000, Statistical description of interrater variability in ordinal data, Statistical Methods in Medical Research, 9, 475, 10.1191/096228000701555262 Nunnally, 1967 Oaster, 1989, Number of alternatives per choice point and stability of Likert-type scales, Perceptual and Motor Skills, 68, 549, 10.2466/pms.1989.68.2.549 Parker, 2012, Defensible Progress Monitoring Data for Medium-and High-Stakes Decisions, The Journal of Special Education, 46, 141, 10.1177/0022466910376837 Pearson, 1901, On lines and planes of closest fit to systems of points is space, Philosophical Magazine Series, 6, 559, 10.1080/14786440109462720 Pearson, 1904, Mathematical contributions to the theory of evolution. XIII Pelham, 2005, Evidence-based assessment of attention deficit hyperactivity disorder in children and adolescents, Journal of Clinical Child and Adolescent Psychology, 34, 449, 10.1207/s15374424jccp3403_5 Pellegrini, 2001, The role of direct observation in the assessment of young children, Journal of Child Psychology and Psychiatry, 42, 861, 10.1017/S002196300100765X Perreault, 1989, Reliability of nominal data based on qualitative judgments, Journal of Marketing Research, 26, 135, 10.2307/3172601 Popham, 2009, Unraveling reliability, How Teachers Learn, 66, 77 Preston, 2000, Optimal number of response categories in rating scales: Reliability, validity, discriminating power, and respondent preferences, Acta Psychologica, 104, 1, 10.1016/S0001-6918(99)00050-5 Ramsay, 1973, The effect of number of categories in rating scales on precision of estimation of scale values, Psychometrika, 38, 513, 10.1007/BF02291492 Remmers, 1941, Reliability of multiple-choice measuring instruments as a function of the Spearman–Brown prophecy formula, Journal of Educational Psychology, 32, 61, 10.1037/h0061781 Reynolds, 2004 Ripley, 1987 Robert, 2004 Rockwood, 1993, Use of goal attainment scaling in measuring clinically important change in the frail elderly, Journal of Clinical Epidemiology, 46, 1113, 10.1016/0895-4356(93)90110-M Rubinstein, 2007 Rushton, 2002, Goal attainment scaling in the rehabilitation of patients with lower-extremity amputations: A pilot study, Archives of Physical and Medical Rehabilitation, 83, 771, 10.1053/apmr.2002.32636 2009 Schumaker, 1977, An analysis of daily report cards and parent-managed privileges in the improvement of adolescents' classroom performance, Journal of Applied Behavior Analysis, 10, 449, 10.1901/jaba.1977.10-449 Scott, 1955, Reliability of content analysis: The case of nominal scale coding, Public Opinion Quarterly, 17, 321, 10.1086/266577 Siegel, 1988 Sim, 2005, The kappa statistic in reliability studies: Use, interpretation, and sample size requirements, Physical Therapy, 85, 257, 10.1093/ptj/85.3.257 Sluyter, 1972, Delayed reinforcement of classroom behavior by parents, Journal of Learning Disabilities, 5, 16, 10.1177/002221947200500102 Steege, 2001, Reliability and accuracy of a performance-based behavioral recording procedure, School Psychology Review, 30, 252, 10.1080/02796015.2001.12086113 Stemler, 2004, A comparison of consensus, consistency, and measurement approaches to estimating interrater reliability, Practical Assessment, Research & Evaluation, 9 Stevens, 1946, On the theory of scales of measurement, Science, 103, 677, 10.1126/science.103.2684.677 Strijbos, 2006, Content analysis: What are they talking about?, Computers in Education, 46, 29, 10.1016/j.compedu.2005.04.002 Sturman, 2005, The capacity to consent to treatment and research: A review of standardized assessment tools, Clinical Psychology Review, 25, 954, 10.1016/j.cpr.2005.04.010 Suen, 1989 Suen, 1990, A decision tree approach to selecting an appropriate observation reliability index, Journal of Psychopathology and Behavioral Assessment, 12, 359, 10.1007/BF00965989 Suen, 1985, Effects of the use of percentage agreement on behavioral observation reliabilities: A reassessment, Journal of Psychopathology and Behavioral Assessment, 7, 221, 10.1007/BF00960754 Svensson, 2001, Guidelines to statistical evaluation of data from rating scales and questionnaires, Journal of Rehabilitation Medicine, 33, 47, 10.1080/165019701300006542 Taplin, 1973, Effects of instructional set and experimenter influence on observer reliability, Child Development, 44, 547, 10.2307/1128011 Thompson, 2000, Psychometrics is datametrics: The test is not reliable, Educational and Psychological Measurement, 60, 174, 10.1177/0013164400602002 Thurstone, 1948, Psychophysical methods Torgerson, 1958 Uebersax, 1987, Diversity of decision-making models and the measurement of interrater agreement, Psychological Bulletin, 101, 140, 10.1037/0033-2909.101.1.140 Vannest, 2008 VassarStats Velicer, 1978, The relation between item format and the structure of the Eysenck Personality Inventory, Applied Psychological Measurement, 2, 293, 10.1177/014662167800200210 Viera, 2005, Understanding interobserver agreement: The kappa Statistic, Family Medicine, 37, 360 Watkins, 2000, Interobserver agreement in behavioral research: Importance and calculation, Journal of Behavioral Education, 10, 205, 10.1023/A:1012295615144 Wehmeyer, 2000, Promoting causal agency: The self determined learning model of instruction, Exceptional Children, 66, 439, 10.1177/001440290006600401