Consumer credit risk: Individual probability estimates using machine learning