The Kappa Statistic in Reliability Studies: Use, Interpretation, and Sample Size Requirements
References
Toussaint, 1999, Sacroiliac joint diagnostics in the Hamburg Construction Workers Study, J Manipulative Physiol Ther, 22, 139, 10.1016/S0161-4754(99)70126-0
Fritz, 2000, The use of a classification approach to identify subgroups of patients with acute low back pain, Spine, 25, 106, 10.1097/00007632-200001010-00018
Riddle, 2002, Evaluation of the presence of sacroiliac joint region dysfunction using a combination of tests: a multicenter intertester reliability study, Phys Ther, 82, 772, 10.1093/ptj/82.8.772
Petersen, 2004, Inter-tester reliability of a new diagnostic classification system for patients with non-specific low back pain, Aust J Physiother, 50, 85, 10.1016/S0004-9514(14)60100-8
Fjellner, 1999, Interexaminer reliability in physical examination of the cervical spine, J Manipulative Physiol Ther, 22, 511, 10.1016/S0161-4754(99)70002-3
Hawk, 1999, Preliminary study of the reliability of assessment procedures for indications for chiropractic adjustments of the lumbar spine, J Manipulative Physiol Ther, 22, 382, 10.1016/S0161-4754(99)70083-7
Smedmark, 2000, Inter-examiner reliability in assessing passive intervertebral motion of the cervical spine, Man Ther, 5, 97, 10.1054/math.2000.0234
Hayes, 2001, Reliability of assessing end-feel and pain and resistance sequences in subjects with painful shoulders and knees, J Orthop Sports Phys Ther, 31, 432, 10.2519/jospt.2001.31.8.432
Kilpikoski, 2002, Interexaminer reliability of low back pain assessment using the McKenzie method, Spine, 27, E207, 10.1097/00007632-200204150-00016
Behensky, 2002, Multisurgeon assessment of coronal pattern classification systems for adolescent idiopathic scoliosis, Spine, 27, 762, 10.1097/00007632-200204010-00015
Speciale, 2002, Observer variability in assessing lumbar spinal stenosis severity on magnetic resonance imaging and its relation to cross-sectional spinal canal area, Spine, 27, 1082, 10.1097/00007632-200205150-00014
Richards, 2003, Comparison of reliability between the Lenke and King classification systems for adolescent idiopathic scoliosis using radiographs that were not premeasured, Spine, 28, 1148, 10.1097/01.BRS.0000067265.52473.C3
Sim, 2000, Research in Health Care: Concepts, Designs and Methods
Cohen, 1960, A coefficient of agreement for nominal scales, Educ Psychol Meas, 20, 37, 10.1177/001316446002000104
Conger, 1980, Integration and generalization of kappas for multiple raters, Psychol Bull, 88, 322, 10.1037/0033-2909.88.2.322
Haley, 1989, Kappa coefficient calculation using multiple ratings per subject: a special communication, Phys Ther, 69, 970, 10.1093/ptj/69.11.970
Fleiss, 1971, Measuring nominal scale agreement among many raters, Psychol Bull, 76, 378, 10.1037/h0031619
Fritz, 2001, Examining diagnostic tests: an evidence-based perspective, Phys Ther, 81, 1546, 10.1093/ptj/81.9.1546
Feinstein, 1990, High agreement but low kappa, I: the problems of two paradoxes, J Clin Epidemiol, 43, 543, 10.1016/0895-4356(90)90158-L
Bartko, 1976, On the methods and theory of reliability, J Nerv Ment Dis, 163, 307, 10.1097/00005053-197611000-00003
Fleiss, 1979, Large sample variance of kappa in the case of different sets of raters, Psychol Bull, 86, 974, 10.1037/0033-2909.86.5.974
Hartmann, 1977, Considerations in the choice of interobserver reliability estimates, J Appl Behav Anal, 10, 103, 10.1901/jaba.1977.10-103
Rigby, 2000, Statistical methods in epidemiology, V: towards an understanding of the kappa coefficient, Disabil Rehabil, 22, 339, 10.1080/096382800296575
Cohen, 1968, Weighted kappa: nominal scale agreement with provision for scaled disagreement or partial credit, Psychol Bull, 70, 213, 10.1037/h0026256
Lantz, 1997, Application and evaluation of the kappa statistic in the design and interpretation of chiropractic clinical research, J Manipulative Physiol Ther, 20, 521
McKenzie, 1981, The Lumbar Spine: Mechanical Diagnosis and Therapy
Brennan, 1992, Statistical methods for assessing observer variability in clinical measures, BMJ, 304, 1491, 10.1136/bmj.304.6840.1491
Donner, 1994, Statistical implications of the choice between a dichotomous or continuous trait in studies of interobserver agreement, Biometrics, 50, 550, 10.2307/2533400
Bartfay, 2000, The effect of collapsing multinomial data when assessing agreement, Int J Epidemiol, 29, 1070, 10.1093/ije/29.6.1070
Maclure, 1987, Misinterpretation and misuse of the kappa statistic, Am J Epidemiol, 126, 161, 10.1093/aje/126.2.161
Streiner, 2003, Health Measurement Scales: A Practical Guide to their Development and Use
Stratford, 1997, Use of the standard error as a reliability index of interest: an applied example using elbow flexor strength data, Phys Ther, 77, 745, 10.1093/ptj/77.7.745
Bland, 1986, Statistical methods for assessing agreement between two methods of clinical measurement, Lancet, 1, 307, 10.1016/S0140-6736(86)90837-8
Banerjee, 1997, Interpreting kappa values for two-observer nursing diagnosis data, Res Nurs Health, 20, 465, 10.1002/(SICI)1098-240X(199710)20:5<465::AID-NUR10>3.0.CO;2-8
Thompson, 1988, A reappraisal of the kappa coefficient, J Clin Epidemiol, 41, 949, 10.1016/0895-4356(88)90031-5
Shoukri, 2004, Measures of Interobserver Agreement
Brennan, 1981, Coefficient kappa: some uses, misuses, and alternatives, Educ Psychol Meas, 41, 687, 10.1177/001316448104100307
Hoehler, 2000, Bias and prevalence effects on kappa viewed in terms of sensitivity and specificity, J Clin Epidemiol, 53, 499, 10.1016/S0895-4356(99)00174-2
Cicchetti, 1990, High agreement but low kappa, II: resolving the paradoxes, J Clin Epidemiol, 43, 551, 10.1016/0895-4356(90)90159-M
Lantz, 1996, Behavior and interpretation of the κ statistic: resolution of the two paradoxes, J Clin Epidemiol, 49, 431, 10.1016/0895-4356(95)00571-4
Gjørup, 1988, The kappa coefficient and the prevalence of a diagnosis, Methods Inf Med, 27, 184, 10.1055/s-0038-1635539
Landis, 1977, The measurement of observer agreement for categorical data, Biometrics, 33, 159, 10.2307/2529310
Fleiss, 1981, Statistical Methods for Rates and Proportions
Altman, 1991, Practical Statistics for Medical Research
Shrout, 1998, Measurement reliability and agreement in psychiatry, Stat Methods Med Res, 7, 301, 10.1177/096228029800700306
Dunn, 1989, Design and Analysis of Reliability Studies: The Statistical Evaluation of Measurement Errors
Brenner, 1996, Dependence of weighted kappa coefficients on the number of categories, Epidemiology, 7, 199, 10.1097/00001648-199603000-00016
Haas, 1991, Statistical methodology for reliability studies, J Manipulative Physiol Ther, 14, 119
Soeken, 1986, Issues in the use of kappa to estimate reliability, Med Care, 24, 733, 10.1097/00005650-198608000-00008
Knight, 1998, The validity of self-reported cocaine use in a criminal justice treatment sample, Am J Drug Alcohol Abuse, 24, 647, 10.3109/00952999809019614
Posner, 1990, Measuring interrater reliability among multiple raters: an example of methods for nominal data, Stat Med, 9, 1103, 10.1002/sim.4780090917
Petersen, 1998, Using the kappa coefficient as a measure of reliability or reproducibility, Chest, 114, 946, 10.1378/chest.114.3.946-a
Main, 1992, The distress and risk assessment method: a simple patient classification to identify distress and evaluate the risk of poor outcome, Spine, 17, 42, 10.1097/00007632-199201000-00007
Sim, 1999, Statistical inference by confidence intervals: issues of interpretation and utilization, Phys Ther, 79, 186, 10.1093/ptj/79.2.186
Flack, 1988, Sample size determinations for the two rater kappa statistic, Psychometrika, 53, 321, 10.1007/BF02294215
Donner, 1992, A goodness-of-fit approach to inference procedures for the kappa statistic: confidence interval construction, significance-testing and sample size estimation, Stat Med, 11, 1511, 10.1002/sim.4780111109