Reliability of multi-category rating scales
References
Agresti, 2002
Altman, 1991
Bailey, 1970, Home-based reinforcement and the modification of pre-delinquents' classroom behavior, Journal of Applied Behavior Analysis, 3, 223, 10.1901/jaba.1970.3-223
Bakeman, 1997, Detecting sequential patterns and determining their reliability with fallible observers, Psychological Methods, 2, 357, 10.1037/1082-989X.2.4.357
Bandalos, 1996, The effects of nonnormality and number of response categories on reliability, Applied Measurement in Education, 9, 151, 10.1207/s15324818ame0902_4
Bendig, 1953, The reliability of self-ratings as a function of the amount of verbal anchoring and the number of categories on the scale, The Journal of Applied Psychology, 37, 38, 10.1037/h0057911
Bendig, 1954, Reliability and the number of rating scale categories, Journal of Applied Psychology, 38, 38, 10.1037/h0055647
Berk, 1979, Generalizability of behavioral observations: A clarification of interobserver agreement and interobserver reliability, American Journal of Mental Deficiency, 83, 460
Bornstein, 1980, Behaviorally specific report cards and self-determined reinforcements: A multiple baseline analysis of inmate offenses, Behavior Modification, 4, 71, 10.1177/014544558041004
Burke, 2009, Reliability of frequent retrospective behavior ratings for elementary school students with EBD, Behavioral Disorders, 34, 212, 10.1177/019874290903400403
Chafouleas, 2002, Good, bad, or in-between: How does the daily behavior report card rate?, Psychology in the Schools, 39, 157, 10.1002/pits.10027
Chafouleas, 2006, Acceptability and reported use of daily behavior report cards among teachers, Journal of Positive Behavior Interventions, 8, 174, 10.1177/10983007060080030601
Christ, 2009, Foundation for the development and use of direct behavior rating (DBR) to assess and evaluate student behavior, Assessment for Effective Intervention, 34, 201, 10.1177/1534508409340390
Cicchetti, 1985, The effect of number of rating scale categories on levels of inter-rater reliability: A Monte-Carlo investigation, Applied Psychological Measurement, 9, 31, 10.1177/014662168500900103
Cohen, 1960, A coefficient of agreement for nominal scales, Educational and Psychological Measurement, 20, 37, 10.1177/001316446002000104
Cohen, 1968, Weighted kappa: Nominal scale agreement with provision for scaled disagreement or partial credit, Psychological Bulletin, 70, 213, 10.1037/h0026256
Cone, 1977, The relevance of reliability and validity for behavioral assessment, Behavior Therapy, 8, 411, 10.1016/S0005-7894(77)80077-4
Conners, 2008
Cooper, 2007
Cox, 1980, The optimal number of response alternatives for a scale: A review, Journal of Marketing Research, 17, 407, 10.2307/3150495
Cramer, 1980
Davies, 1989, Effects of a daily report card on disruptive behaviour in primary students, B.C. Journal of Special Education, 13, 173
Dawes, 2007, Do data characteristics change according to the number of scale points used? An experiment using 5-point, 7-point and 10-point scales, International Journal of Market Research, 50, 61, 10.1177/147078530805000106
Deno, 2003, Developments in curriculum-based measurement, Journal of Special Education, 37, 184, 10.1177/00224669030370030801
Di Eugenio, 2004, The kappa statistic: A second look, Computational Linguistics, 30, 95, 10.1162/089120104773633402
Fabiano, 2009, An investigation of the technical adequacy of a daily behavior report card (DBRC) for monitoring progress of students with attention-deficit/hyperactivity disorder in special education placements, Assessment for Effective Intervention, 34, 231, 10.1177/1534508409333344
Finn, 1972, Effects of some variations in rating scale characteristics on the means and reliabilities of ratings, Educational and Psychological Measurement, 34, 885
Fleiss, 1971, Measuring nominal scale agreement among many raters, Psychological Bulletin, 76, 378, 10.1037/h0031619
Fleiss, 1973, The equivalence of weighted kappa and the intraclass correlation coefficient as measures of reliability, Educational and Psychological Measurement, 33, 613, 10.1177/001316447303300309
Good, 1998, Contemporary perspectives on curriculum-based measurement validity, 61
Green, 1970, Rating scales and information recovery: How many scales and response categories to use?, Journal of Marketing, 34, 33, 10.2307/1249817
Gresham, 1990
Gwet
Gwet, 2010
Hambleton, 2003, Advances in criterion-referenced testing methods and practices, 377
Hintze, 2005, Psychometrics of direct observation, School Psychology Review, 34, 507, 10.1080/02796015.2005.12088012
Hintze, 2006
Irtel, 2005, Psychophysical scaling, Vol. 3, 1628
Jenkins, 1977, A Monte-Carlo study of factors affecting three indices of composite scale reliability, Journal of Applied Psychology, 62, 392, 10.1037/0021-9010.62.4.392
Jurbergs, 2007, School-home notes with and without response cost: Increasing attention and academic performance in low-income children with attention-deficit/hyperactivity disorder, School Psychology Quarterly, 22, 358, 10.1037/1045-3830.22.3.358
Kaplan, 1964
Kelley, 1995, Promoting academic performance in inattentive children: The relative efficacy of school-home notes with and without response cost, Behavior Modification, 19, 357, 10.1177/01454455950193006
Kendall, 1938, A new measure of rank correlation, Biometrika, 30, 81, 10.1093/biomet/30.1-2.81
Kendall, 1970
King, 1999, Goal attainment scaling: Its use in evaluating pediatric therapy programs, Physical & Occupational Therapy in Pediatrics, 19, 31, 10.1300/J006v19n02_03
Kiresuk, 1994
Kline, 2004
Kolbe, 1991, Content analysis research: An examination of applications with directives for improving research reliability and objectivity, Journal of Consumer Research, 18, 243, 10.1086/209256
Krippendorff, 1980
Krosnick, J. A., & Tahk, A. M. (2010). The optimal length of rating scales to maximize reliability and validity. Unpublished manuscript.
Lissitz, 1975, Effect of the number of scale points on reliability: A Monte-Carlo approach, Journal of Applied Psychology, 60, 10, 10.1037/h0076268
Loken, 1987, The use of 0–10 scales in telephone surveys, Journal of the Market Research Society, 29, 353
Lombard, 2010
Martin, 1973, The effects of scaling on the correlation coefficient: A test of validity, Journal of Marketing Research, 10, 316, 10.2307/3149702
Martin, 1978, Effects of scaling on the correlation coefficient: Additional considerations, Journal of Marketing Research, 15, 314, 10.2307/3151268
Mash, 1974, Situational effects on observer accuracy: Behavioral predictability, prior experience, and complexity of coding categories, Child Development, 45, 367, 10.2307/1127957
McCain, 1993, Managing the classroom behavior of an ADHD preschooler: The efficacy of a school–home note intervention, Child and Family Behavior Therapy, 15, 33, 10.1300/J019v15n03_03
McDermott, 1988, Agreement among diagnosticians or observers: Its importance and determination, Professional School Psychology, 3, 225, 10.1037/h0090563
McKelvie, 1978, Graphic rating scales: How many categories?, British Journal of Psychology, 69, 185, 10.1111/j.2044-8295.1978.tb01647.x
Miller, 1956, The magical number seven, plus or minus two: Some limits on our capacity for processing information, Psychological Review, 63, 81, 10.1037/h0043158
Nelson, 2000, Statistical description of interrater variability in ordinal data, Statistical Methods in Medical Research, 9, 475, 10.1191/096228000701555262
Nunnally, 1967
Oaster, 1989, Number of alternatives per choice point and stability of Likert-type scales, Perceptual and Motor Skills, 68, 549, 10.2466/pms.1989.68.2.549
Parker, 2012, Defensible progress monitoring data for medium- and high-stakes decisions, The Journal of Special Education, 46, 141, 10.1177/0022466910376837
Pearson, 1901, On lines and planes of closest fit to systems of points in space, Philosophical Magazine Series, 6, 559, 10.1080/14786440109462720
Pearson, 1904, Mathematical contributions to the theory of evolution. XIII
Pelham, 2005, Evidence-based assessment of attention deficit hyperactivity disorder in children and adolescents, Journal of Clinical Child and Adolescent Psychology, 34, 449, 10.1207/s15374424jccp3403_5
Pellegrini, 2001, The role of direct observation in the assessment of young children, Journal of Child Psychology and Psychiatry, 42, 861, 10.1017/S002196300100765X
Perreault, 1989, Reliability of nominal data based on qualitative judgments, Journal of Marketing Research, 26, 135, 10.2307/3172601
Popham, 2009, Unraveling reliability, How Teachers Learn, 66, 77
Preston, 2000, Optimal number of response categories in rating scales: Reliability, validity, discriminating power, and respondent preferences, Acta Psychologica, 104, 1, 10.1016/S0001-6918(99)00050-5
Ramsay, 1973, The effect of number of categories in rating scales on precision of estimation of scale values, Psychometrika, 38, 513, 10.1007/BF02291492
Remmers, 1941, Reliability of multiple-choice measuring instruments as a function of the Spearman–Brown prophecy formula, Journal of Educational Psychology, 32, 61, 10.1037/h0061781
Reynolds, 2004
Ripley, 1987
Robert, 2004
Rockwood, 1993, Use of goal attainment scaling in measuring clinically important change in the frail elderly, Journal of Clinical Epidemiology, 46, 1113, 10.1016/0895-4356(93)90110-M
Rubinstein, 2007
Rushton, 2002, Goal attainment scaling in the rehabilitation of patients with lower-extremity amputations: A pilot study, Archives of Physical Medicine and Rehabilitation, 83, 771, 10.1053/apmr.2002.32636
2009
Schumaker, 1977, An analysis of daily report cards and parent-managed privileges in the improvement of adolescents' classroom performance, Journal of Applied Behavior Analysis, 10, 449, 10.1901/jaba.1977.10-449
Scott, 1955, Reliability of content analysis: The case of nominal scale coding, Public Opinion Quarterly, 17, 321, 10.1086/266577
Siegel, 1988
Sim, 2005, The kappa statistic in reliability studies: Use, interpretation, and sample size requirements, Physical Therapy, 85, 257, 10.1093/ptj/85.3.257
Sluyter, 1972, Delayed reinforcement of classroom behavior by parents, Journal of Learning Disabilities, 5, 16, 10.1177/002221947200500102
Steege, 2001, Reliability and accuracy of a performance-based behavioral recording procedure, School Psychology Review, 30, 252, 10.1080/02796015.2001.12086113
Stemler, 2004, A comparison of consensus, consistency, and measurement approaches to estimating interrater reliability, Practical Assessment, Research & Evaluation, 9
Stevens, 1946, On the theory of scales of measurement, Science, 103, 677, 10.1126/science.103.2684.677
Strijbos, 2006, Content analysis: What are they talking about?, Computers & Education, 46, 29, 10.1016/j.compedu.2005.04.002
Sturman, 2005, The capacity to consent to treatment and research: A review of standardized assessment tools, Clinical Psychology Review, 25, 954, 10.1016/j.cpr.2005.04.010
Suen, 1989
Suen, 1990, A decision tree approach to selecting an appropriate observation reliability index, Journal of Psychopathology and Behavioral Assessment, 12, 359, 10.1007/BF00965989
Suen, 1985, Effects of the use of percentage agreement on behavioral observation reliabilities: A reassessment, Journal of Psychopathology and Behavioral Assessment, 7, 221, 10.1007/BF00960754
Svensson, 2001, Guidelines to statistical evaluation of data from rating scales and questionnaires, Journal of Rehabilitation Medicine, 33, 47, 10.1080/165019701300006542
Taplin, 1973, Effects of instructional set and experimenter influence on observer reliability, Child Development, 44, 547, 10.2307/1128011
Thompson, 2000, Psychometrics is datametrics: The test is not reliable, Educational and Psychological Measurement, 60, 174, 10.1177/0013164400602002
Thurstone, 1948, Psychophysical methods
Torgerson, 1958
Uebersax, 1987, Diversity of decision-making models and the measurement of interrater agreement, Psychological Bulletin, 101, 140, 10.1037/0033-2909.101.1.140
Vannest, 2008
VassarStats
Velicer, 1978, The relation between item format and the structure of the Eysenck Personality Inventory, Applied Psychological Measurement, 2, 293, 10.1177/014662167800200210
Viera, 2005, Understanding interobserver agreement: The kappa statistic, Family Medicine, 37, 360
Watkins, 2000, Interobserver agreement in behavioral research: Importance and calculation, Journal of Behavioral Education, 10, 205, 10.1023/A:1012295615144
Wehmeyer, 2000, Promoting causal agency: The self-determined learning model of instruction, Exceptional Children, 66, 439, 10.1177/001440290006600401