The calibrated model-based concordance improved assessment of discriminative ability in patient clusters of limited sample size
Abstract
Discriminative ability is an important aspect of prediction model performance, but it is challenging to assess in clustered (e.g., multicenter) data. Concordance (c)-indexes may be too extreme within small clusters. We aimed to define a new approach for assessing discriminative ability in clustered data. We assessed the discriminative ability of a prediction model for the binary outcome of mortality after traumatic brain injury within centers of the CRASH trial. Using multilevel logistic regression analysis, we estimated cluster-specific calibration slopes, which we used to obtain the recently proposed calibrated model-based concordance (c-mbc) within each cluster. We compared the c-mbc with the naïve c-index in centers of the CRASH trial and in simulations of clusters with varying calibration slopes. The distribution of the c-mbc was less extreme than that of the c-index in 19 European centers (internal validation; n = 1716) and 36 non-European centers (external validation; n = 3135) of the CRASH trial. In simulations, the c-mbc was biased but less variable than the naïve c-index, resulting in lower root mean squared errors. The c-mbc, based on multilevel regression analysis of the calibration slope, is an attractive alternative to the c-index as a measure of discriminative ability in multicenter studies with patient clusters of limited sample size.
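To make the contrast between the two measures concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the usual form of the model-based concordance for a logistic model: each ordered patient pair is weighted by the model-implied probability that the first patient has the event and the second does not. The function names, the toy linear predictors, and the calibration intercept and slope are all hypothetical; in the study the cluster-specific slope would come from a multilevel logistic regression, which is not shown here.

```python
import math


def expit(x):
    """Inverse logit: map a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def model_based_concordance(probs):
    """Model-based concordance computed from predicted probabilities only.

    Assumed form: each ordered pair (i, j) is weighted by p_i * (1 - p_j),
    the model-implied probability that i has the event and j does not.
    Pairs with tied predictions count for one half.
    """
    num = 0.0
    den = 0.0
    for i, p_i in enumerate(probs):
        for j, p_j in enumerate(probs):
            if i == j:
                continue
            weight = p_i * (1.0 - p_j)
            den += weight
            if p_i > p_j:
                num += weight
            elif p_i == p_j:
                num += 0.5 * weight
    return num / den


def calibrated_mbc(linear_predictors, intercept, slope):
    """c-mbc sketch: recalibrate the linear predictor, then compute the mbc.

    `intercept` and `slope` are hypothetical cluster-specific calibration
    parameters supplied by the caller.
    """
    probs = [expit(intercept + slope * lp) for lp in linear_predictors]
    return model_based_concordance(probs)


def naive_c_index(linear_predictors, outcomes):
    """Naive c-index: share of (event, non-event) pairs ranked correctly."""
    events = [lp for lp, y in zip(linear_predictors, outcomes) if y == 1]
    non_events = [lp for lp, y in zip(linear_predictors, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    score = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                score += 1.0
            elif e == n:
                score += 0.5
    return score / (len(events) * len(non_events))


# Toy cluster of four patients (hypothetical values).
lps = [-2.0, -1.0, 0.0, 1.0]
outcomes = [0, 0, 1, 1]

c_naive = naive_c_index(lps, outcomes)    # depends on the observed outcomes
c_mbc_full = calibrated_mbc(lps, 0.0, 1.0)  # calibration slope of 1
c_mbc_half = calibrated_mbc(lps, 0.0, 0.5)  # shrunken calibration slope
print(c_naive, c_mbc_full, c_mbc_half)
```

In this toy cluster the naïve c-index takes its most extreme value because the few observed outcomes happen to be perfectly ranked, whereas the c-mbc depends only on the spread of the recalibrated predictions and decreases as the calibration slope shrinks toward zero.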