Inequalities between multi-rater kappas

Advances in Data Analysis and Classification, Volume 4, Issue 4, pp. 271–286, 2010
Matthijs J. Warrens1
1 Unit Methodology and Statistics, Institute of Psychology, Leiden University, 2300 RB Leiden, The Netherlands

Abstract

Keywords


References

Artstein R, Poesio M (2005) Kappa³ = Alpha (or Beta). NLE Technical Note 05-1, University of Essex

Banerjee M, Capozzoli M, McSweeney L, Sinha D (1999) Beyond kappa: a review of interrater agreement measures. Can J Stat 27: 3–23

Bennett EM, Alpert R, Goldstein AC (1954) Communications through limited response questioning. Public Opin Q 18: 303–308

Berry KJ, Mielke PW (1988) A generalization of Cohen’s kappa agreement measure to interval measurement and multiple raters. Educ Psychol Meas 48: 921–933

Brennan RL, Prediger DJ (1981) Coefficient kappa: some uses, misuses, and alternatives. Educ Psychol Meas 41: 687–699

Cohen J (1960) A coefficient of agreement for nominal scales. Educ Psychol Meas 20: 37–46

Cohen J (1968) Weighted kappa: nominal scale agreement with provision for scaled disagreement or partial credit. Psychol Bull 70: 213–220

Conger AJ (1980) Integration and generalization of kappas for multiple raters. Psychol Bull 88: 322–328

Craig RT (1981) Generalization of Scott’s index of intercoder agreement. Public Opin Q 45: 260–264

Davies M, Fleiss JL (1982) Measuring agreement for multinomial data. Biometrics 38: 1047–1051

De Mast J (2007) Agreement and kappa-type indices. Am Stat 61: 148–153

Di Eugenio B, Glass M (2004) The kappa statistic: a second look. Comput Linguist 30: 95–101

Dou W, Ren Y, Wu Q, Ruan S, Chen Y, Bloyet D, Constans J-M (2007) Fuzzy kappa for the agreement measure of fuzzy classifications. Neurocomputing 70: 726–734

Fleiss JL (1971) Measuring nominal scale agreement among many raters. Psychol Bull 76: 378–382

Gwet KL (2008) Variance estimation of nominal-scale inter-rater reliability with random selection of raters. Psychometrika 73: 407–430

Heuvelmans APJM, Sanders PF (1993) Beoordelaarsovereenstemming. In: Eggen TJHM, Sanders PF (eds) Psychometrie in de Praktijk. Cito Instituut voor Toetsontwikkeling, Arnhem, pp 443–470

Hsu LM, Field R (2003) Interrater agreement measures: comments on kappa_n, Cohen’s kappa, Scott’s π, and Aickin’s α. Underst Stat 2: 205–219

Hubert L (1977) Kappa revisited. Psychol Bull 84: 289–297

Janes CL (1979) An extension of the random error coefficient of agreement to N × N tables. Br J Psychiatry 134: 617–619

Janson H, Olsson U (2001) A measure of agreement for interval or nominal multivariate observations. Educ Psychol Meas 61: 277–289

Janson S, Vegelius J (1979) On generalizations of the G index and the Phi coefficient to nominal scales. Multivar Behav Res 14: 255–269

Kraemer HC (1979) Ramifications of a population model for κ as a coefficient of reliability. Psychometrika 44: 461–472

Kraemer HC (1980) Extensions of the kappa coefficient. Biometrics 36: 207–216

Kraemer HC, Periyakoil VS, Noda A (2002) Tutorial in biostatistics: kappa coefficients in medical research. Stat Med 21: 2109–2129

Krippendorff K (1987) Association, agreement, and equity. Qual Quant 21: 109–123

Landis JR, Koch GG (1977) The measurement of observer agreement for categorical data. Biometrics 33: 159–174

Light RJ (1971) Measures of response agreement for qualitative data: some generalizations and alternatives. Psychol Bull 76: 365–377

Mitrinović DS (1964) Elementary inequalities. P. Noordhoff, Groningen

O’Malley FP, Mohsin SK, Badve S, Bose S, Collins LC, Ennis M, Kleer CG, Pinder SE, Schnitt SJ (2006) Interobserver reproducibility in the diagnosis of flat epithelial atypia of the breast. Mod Pathol 19: 172–179

Popping R (1983) Overeenstemmingsmaten voor nominale data. PhD thesis, Rijksuniversiteit Groningen, Groningen

Randolph JJ (2005) Free-marginal multirater kappa (multirater κ_free): an alternative to Fleiss’ fixed-marginal multirater kappa. Paper presented at the Joensuu Learning and Instruction Symposium, Joensuu, Finland

Schouten HJA (1980) Measuring agreement among many observers. Biom J 22: 497–504

Schouten HJA (1982) Measuring pairwise agreement among many observers. Biom J 24: 431–435

Schouten HJA (1986) Nominal scale agreement among observers. Psychometrika 51: 453–466

Scott WA (1955) Reliability of content analysis: the case of nominal scale coding. Public Opin Q 19: 321–325

Vanbelle S, Albert A (2009) A note on the linearly weighted kappa coefficient for ordinal scales. Stat Methodol 6: 157–163

Warrens MJ (2008a) On similarity coefficients for 2 × 2 tables and correction for chance. Psychometrika 73: 487–502

Warrens MJ (2008b) Bounds of resemblance measures for binary (presence/absence) variables. J Classif 25: 195–208

Warrens MJ (2008c) On association coefficients for 2 × 2 tables and properties that do not depend on the marginal distributions. Psychometrika 73: 777–789

Warrens MJ (2008d) On the equivalence of Cohen’s kappa and the Hubert-Arabie adjusted Rand index. J Classif 25: 177–183

Warrens MJ (2008e) On the indeterminacy of resemblance measures for (presence/absence) data. J Classif 25: 125–136

Warrens MJ (2010a) Inequalities between kappa and kappa-like statistics for k × k tables. Psychometrika 75: 176–185

Warrens MJ (2010b) A formal proof of a paradox associated with Cohen’s kappa. J Classif (in press)

Warrens MJ (2010c) Cohen’s kappa can always be increased and decreased by combining categories. Stat Methodol 7: 673–677

Warrens MJ (2010d) A Kraemer-type rescaling that transforms the odds ratio into the weighted kappa coefficient. Psychometrika 75: 328–330

Zwick R (1988) Another look at interrater agreement. Psychol Bull 103: 374–378