An experimental comparison of classification algorithms for imbalanced credit scoring data sets

Expert Systems with Applications - Volume 39 - Pages 3446-3453 - 2012
Iain Brown¹, Christophe Mues¹
¹School of Management, University of Southampton, Highfield, Southampton SO17 1BJ, UK
