Overfitting in prediction models – Is it a problem only in high dimensions?

Contemporary Clinical Trials - Volume 36 - Pages 636-641 - 2013
Jyothi Subramanian (1), Richard Simon (2)
(1) Emmes Corporation, USA
(2) Biometric Research Branch, National Cancer Institute, USA
