
Model selection criteria based on cross-validatory concordance statistics

Computational Statistics - Volume 33 - Pages 595-621 - 2017

Patrick Ten Eyck¹, Joseph E. Cavanaugh²

¹Institute for Clinical and Translational Science, The University of Iowa, Iowa City, USA

²Department of Biostatistics, The University of Iowa, Iowa City, USA

Abstract

In the logistic regression framework, we present the development and investigation of three model selection criteria based on cross-validatory analogues of the traditional and adjusted c-statistics. These criteria are designed to estimate three corresponding measures of predictive error: the model misspecification prediction error, the fitting sample prediction error, and the sum of prediction errors. We aim to show that these estimators serve as suitable model selection criteria, helping to identify a model that balances goodness of fit against parsimony while remaining generalizable. We examine the properties of the selection criteria through an extensive simulation study designed as a factorial experiment. We then apply these measures in a practical application based on modeling the occurrence of heart disease.

Keywords

#model selection criteria #logistic regression #prediction error #c-statistic #simulation
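
The abstract does not give the definitions of the three criteria, so the following is only a minimal sketch of the general idea behind a cross-validatory c-statistic for logistic regression, not the authors' method. It assumes k-fold cross-validation with pooled out-of-fold predictions, a plain gradient-ascent fit, and simulated data; the function names (`c_statistic`, `fit_logistic`, `cv_c_statistic`, `simulate`) and all tuning values are hypothetical choices made for illustration.

```python
import math
import random


def predict(beta, x):
    """Fitted event probability for one observation; beta[0] is the intercept."""
    eta = beta[0] + sum(b * v for b, v in zip(beta[1:], x))
    eta = max(-30.0, min(30.0, eta))  # clip to keep exp() from overflowing
    return 1.0 / (1.0 + math.exp(-eta))


def c_statistic(scores, labels):
    """Share of (event, non-event) pairs ranked correctly; ties count 1/2."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    total = 0.0
    for p in pos:
        for n in neg:
            total += 1.0 if p > n else 0.5 if p == n else 0.0
    return total / (len(pos) * len(neg))


def fit_logistic(X, y, steps=500, lr=0.1):
    """Gradient ascent on the logistic log-likelihood (illustrative, not IRLS)."""
    beta = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(steps):
        grad = [0.0] * len(beta)
        for xi, yi in zip(X, y):
            resid = yi - predict(beta, xi)
            grad[0] += resid
            for j, xij in enumerate(xi):
                grad[j + 1] += resid * xij
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


def cv_c_statistic(X, y, k=5, seed=0):
    """c-statistic computed on pooled out-of-fold predictions (an assumption)."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    out_of_fold = [0.0] * len(y)
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        beta = fit_logistic([X[i] for i in train], [y[i] for i in train])
        for i in fold:
            out_of_fold[i] = predict(beta, X[i])
    return c_statistic(out_of_fold, y)


def simulate(n=200, slope=2.0, seed=1):
    """Toy data: column 0 drives the outcome, column 1 is pure noise."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(n)]
    y = [1 if rng.random() < predict([0.0, slope, 0.0], x) else 0 for x in X]
    return X, y


if __name__ == "__main__":
    X, y = simulate()
    apparent = c_statistic([predict(fit_logistic(X, y), x) for x in X], y)
    print("apparent c:", round(apparent, 3))
    print("cross-validated c, both columns:", round(cv_c_statistic(X, y), 3))
    print("cross-validated c, column 0 only:",
          round(cv_c_statistic([[x[0]] for x in X], y), 3))
```

Comparing the cross-validated value across candidate predictor sets, as in the last two lines, is one way such a statistic could be used for selection; the apparent c-statistic is computed on the fitting sample itself and therefore tends to be optimistic.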

