Classifying the diagnosis of study participants in clinical trials: a structured and efficient approach
Abstract
A challenge in imaging research is the diagnostic classification of study participants. We hypothesised that a structured approach would be efficient and that classification by medical students and residents, with referral to an expert panel whenever necessary, would be as valid as classification of all patients by experts. OPTIMACT is a randomised trial designed to evaluate the effectiveness of replacing chest x-ray with ultra-low-dose chest computed tomography (CT) at the emergency department. We developed a handbook with diagnostic guidelines and randomly selected 240 cases from the 2,418 participants enrolled in OPTIMACT. Each case was independently classified by two medical students and, if they disagreed, by the students and a resident in a consensus meeting. Cases without consensus and cases classified as complex were assessed by a panel of medical specialists. To evaluate validity, 60 randomly selected cases that the students and residents had not referred to the panel were reassessed by the specialists. Overall, the students and, if necessary, residents were able to assign a diagnosis in 183 of the 240 cases (76% concordance; 95% confidence interval [CI] 71–82%). The classifications by students and residents agreed with those of the medical specialists in 50 of the 60 reassessed cases (83% concordance; 95% CI 74–93%). A structured approach in which study participants are assigned diagnostic labels by assessors with increasing levels of medical experience was an efficient and valid classification method that limited the workload for medical specialists. We present a viable option for classifying study participants in large-scale imaging trials (Netherlands National Trial Register number NTR6163).
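The two proportions and their confidence intervals can be checked with a few lines of Python. This is only a sketch: the abstract does not state which interval method was used, so the normal-approximation (Wald) interval here is an assumption, and the function name `wald_interval` is hypothetical. With these inputs it happens to reproduce the reported bounds after rounding to whole percentages.

```python
import math


def wald_interval(successes, total, z=1.96):
    """Proportion with a normal-approximation (Wald) 95% interval.

    Assumption: the abstract does not name its interval method; the Wald
    interval is used here only because it is the simplest common choice.
    """
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    # Clamp to [0, 1] so the bounds stay valid proportions.
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


# Counts taken from the abstract.
results = [
    ("Diagnosis assigned without the panel", 183, 240),
    ("Agreement with medical specialists", 50, 60),
]

for label, k, n in results:
    p, lo, hi = wald_interval(k, n)
    print(f"{label}: {k}/{n} = {p:.0%} (95% CI {lo:.0%}-{hi:.0%})")
```

For samples this small, or proportions close to 0 or 1, a Wilson or exact interval would usually be preferred; the Wald form is shown only because it matches the figures quoted above.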