Benchmarking state-of-the-art classification algorithms for credit scoring: An update of research

European Journal of Operational Research - Volume 247, Issue 1 - Pages 124-136 - 2015
Stefan Lessmann1, Bart Baesens2,3, Hsin‐Vonn Seow4, Lyn C. Thomas3
1School of Business and Economics, Humboldt University of Berlin, Unter den Linden 6, 10099, Berlin, Germany
2Department of Decision Sciences & Information Management, Catholic University of Leuven, Naamsestraat 69, B-3000 Leuven, Belgium
3School of Management, University of Southampton, Highfield Southampton, SO17 1BJ, United Kingdom
4Nottingham University Business School, University of Nottingham-Malaysia Campus, Jalan Broga, 43500 Semenyih, Selangor Darul Ehsan, Malaysia

Abstract

Keywords


References

Abdou, 2009, Genetic programming for credit scoring: The case of Egyptian public sector banks, Expert Systems with Applications, 36, 11402, 10.1016/j.eswa.2009.01.076

Abdou, 2008, Neural nets versus conventional techniques in credit scoring in Egyptian banking, Expert Systems with Applications, 35, 1275, 10.1016/j.eswa.2007.08.030

Abellán, 2014, Improving experimental studies about ensembles of classifiers for bankruptcy prediction and credit scoring, Expert Systems with Applications, 41, 3825, 10.1016/j.eswa.2013.12.003

Akkoc, 2012, An empirical comparison of conventional techniques, neural networks and the three stage hybrid Adaptive Neuro Fuzzy Inference System (ANFIS) model for credit scoring analysis: The case of Turkish credit card data, European Journal of Operational Research, 222, 168, 10.1016/j.ejor.2012.04.009

Andreeva, 2006, European generic scoring models using survival analysis, Journal of the Operational Research Society, 57, 1180, 10.1057/palgrave.jors.2602091

Atish, 2004, Evaluating and tuning predictive data mining models using receiver operating characteristic curves, Journal of Management Information Systems, 21, 249, 10.1080/07421222.2004.11045815

Baesens, 2003, Benchmarking state-of-the-art classification algorithms for credit scoring, Journal of the Operational Research Society, 54, 627, 10.1057/palgrave.jors.2601545

Bellotti, 2009, Support vector machines for credit scoring and discovery of significant features, Expert Systems with Applications, 36, 3302, 10.1016/j.eswa.2008.01.005

Breiman, 1996, Bagging predictors, Machine Learning, 24, 123, 10.1007/BF00058655

Breiman, 2001, Random forests, Machine Learning, 45, 5, 10.1023/A:1010933404324

Brown, 2012, An experimental comparison of classification algorithms for imbalanced credit scoring data sets, Expert Systems with Applications, 39, 3446, 10.1016/j.eswa.2011.09.033

Calabrese, 2014, Downturn loss given default: Mixture distribution estimation, European Journal of Operational Research, 237, 271, 10.1016/j.ejor.2014.01.043

Caruana, 2006, Getting the most out of ensemble selection, 828

Chen, 2009, Mining the customer credit using hybrid support vector machine technique, Expert Systems with Applications, 36, 7611, 10.1016/j.eswa.2008.09.054

Crone, 2006, The impact of preprocessing on data mining: An evaluation of classifier sensitivity in direct marketing, European Journal of Operational Research, 173, 781, 10.1016/j.ejor.2005.07.023

Crook, 2007, Recent developments in consumer credit risk assessment, European Journal of Operational Research, 183, 1447, 10.1016/j.ejor.2006.09.100

Demšar, 2006, Statistical comparisons of classifiers over multiple data sets, Journal of Machine Learning Research, 7, 1

Dietterich, 1998, Approximate statistical tests for comparing supervised classification learning, Neural Computation, 10, 1895, 10.1162/089976698300017197

Dirick, 2015, An Akaike information criterion for multiple event mixture cure models, European Journal of Operational Research, 241, 449, 10.1016/j.ejor.2014.08.038

Fawcett, 2006, An introduction to ROC analysis, Pattern Recognition Letters, 27, 861, 10.1016/j.patrec.2005.10.010

Finlay, 2009, Credit scoring for profitability objectives, European Journal of Operational Research, 202, 528, 10.1016/j.ejor.2009.05.025

Finlay, 2011, Multiple classifier architectures and their application to credit risk assessment, European Journal of Operational Research, 210, 368, 10.1016/j.ejor.2010.09.029

Freund, 1996, Experiments with a new boosting algorithm, 148

Friedman, 2002, Stochastic gradient boosting, Computational Statistics & Data Analysis, 38, 367, 10.1016/S0167-9473(01)00065-2

García, 2010, Advanced nonparametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: Experimental analysis of power, Information Sciences, 180, 2044, 10.1016/j.ins.2009.12.010

García, 2008, An extension on “Statistical comparisons of classifiers over multiple data sets” for all pairwise comparisons, Journal of Machine Learning Research, 9, 2677

Gong, 2012, A Kolmogorov–Smirnov statistic based segmentation approach to learning from imbalanced datasets: With application in property refinance prediction, Expert Systems with Applications, 39, 6192, 10.1016/j.eswa.2011.12.011

Guang-Bin, 2006, Universal approximation using incremental constructive feedforward networks with random hidden nodes, IEEE Transactions on Neural Networks, 17, 879, 10.1109/TNN.2006.875977

Hall, 2000, Correlation-based feature selection for discrete and numeric class machine learning, 359

Hand, 2005, Good practice in retail credit scorecard assessment, Journal of the Operational Research Society, 56, 1109, 10.1057/palgrave.jors.2601932

Hand, 2006, Classifier technology and the illusion of progress, Statistical Science, 21, 1, 10.1214/088342306000000060

Hand, 2009, Measuring classifier performance: A coherent alternative to the area under the ROC curve, Machine Learning, 77, 103, 10.1007/s10994-009-5119-5

Hand, 2013, When is the area under the receiver operating characteristic curve an appropriate measure of classifier performance?, Pattern Recognition Letters, 34, 492, 10.1016/j.patrec.2012.12.004

Hand, 1997, Statistical classification models in consumer credit scoring: A review, Journal of the Royal Statistical Society: Series A (General), 160, 523, 10.1111/j.1467-985X.1997.00078.x

Hand, 2005, Optimal bipartite scorecards, Expert Systems with Applications, 29, 684, 10.1016/j.eswa.2005.04.032

He, 2004, Classifications of credit cardholder behavior by using multiple criteria non-linear programming, vol. 3327, 154

Hens, 2012, Computational time reduction for credit scoring: An integrated approach based on support vector machine and stratified sampling method, Expert Systems with Applications, 39, 6774, 10.1016/j.eswa.2011.12.057

Hernández-Orallo, 2011, Brier curves: A new cost-based visualisation of classifier performance, 585

Hofer, 2015, Adapting a classification rule to local and global shift when only unlabelled data are available, European Journal of Operational Research, 243, 177, 10.1016/j.ejor.2014.11.022

Hsieh, 2010, A data driven ensemble classifier for credit scoring analysis, Expert Systems with Applications, 37, 534, 10.1016/j.eswa.2009.05.059

Huang, 2007, Credit scoring with a data mining approach based on support vector machines, Expert Systems with Applications, 33, 847, 10.1016/j.eswa.2006.07.007

Huang, 2006, Evaluation of neural networks and data mining methods on a credit assessment task for class imbalance problem, Nonlinear Analysis: Real World Applications, 7, 720, 10.1016/j.nonrwa.2005.04.006

Ko, 2008, From dynamic classifier selection to dynamic ensemble selection, Pattern Recognition, 41, 1735, 10.1016/j.patcog.2007.10.015

Kruppa, 2013, Consumer credit risk: Individual probability estimates using machine learning, Expert Systems with Applications, 40, 5125, 10.1016/j.eswa.2013.03.019

Kumar, 2007, Bankruptcy prediction in banks and firms via statistical and intelligent techniques—A review, European Journal of Operational Research, 180, 1, 10.1016/j.ejor.2006.08.043

Lee, 2005, A two-stage hybrid credit scoring model using artificial neural networks and multivariate adaptive regression splines, Expert Systems with Applications, 28, 743, 10.1016/j.eswa.2004.12.031

Lee, 2006, Mining the customer credit using classification and regression tree and multivariate adaptive regression splines, Computational Statistics & Data Analysis, 50, 1113, 10.1016/j.csda.2004.11.006

Li, 2011, An evolution strategy-based multiple kernels multi-criteria programming approach: The case of credit decision making, Decision Support Systems, 51, 292, 10.1016/j.dss.2010.11.022

Li, 2006, The evaluation of consumer loans using support vector machines, Expert Systems with Applications, 30, 772, 10.1016/j.eswa.2005.07.041

Li, 2012, Relevance vector machine based infinite decision agent ensemble learning for credit risk analysis, Expert Systems with Applications, 39, 4947, 10.1016/j.eswa.2011.10.022

Lichman, 2013

Liu, 2015, Identifying future defaulters: A hierarchical Bayesian method, European Journal of Operational Research, 241, 202, 10.1016/j.ejor.2014.08.008

Malhotra, 2003, Evaluating consumer loans using neural networks, Omega, 31, 83, 10.1016/S0305-0483(03)00016-1

Marqués, 2012, Exploring the behaviour of base classifiers in credit scoring ensembles, Expert Systems with Applications, 39, 10244, 10.1016/j.eswa.2012.02.092

Marqués, 2012, Two-level classifier ensembles for credit risk assessment, Expert Systems with Applications, 39, 10916, 10.1016/j.eswa.2012.03.033

Martens, 2010, Credit rating prediction using Ant Colony Optimization, Journal of the Operational Research Society, 61, 561, 10.1057/jors.2008.164

Nanni, 2009, An experimental comparison of ensemble of classifiers for bankruptcy prediction and credit scoring, Expert Systems with Applications, 36, 3028, 10.1016/j.eswa.2008.01.018

Ong, 2005, Building credit scoring models using genetic programming, Expert Systems with Applications, 29, 41, 10.1016/j.eswa.2005.01.003

Paleologo, 2010, Subagging for credit scoring models, European Journal of Operational Research, 201, 490, 10.1016/j.ejor.2009.03.008

Partalas, 2009, Pruning an ensemble of classifiers via reinforcement learning, Neurocomputing, 72, 1900, 10.1016/j.neucom.2008.06.007

Partalas, 2010, An ensemble uncertainty aware measure for directed hill climbing ensemble pruning, Machine Learning, 81, 257, 10.1007/s10994-010-5172-0

Ping, 2011, Neighborhood rough set and SVM based hybrid credit scoring classifier, Expert Systems with Applications, 38, 11300, 10.1016/j.eswa.2011.02.179

Platt, 2000, Probabilities for support vector machines, 61

Pundir, 2012, A novel concept of partial Lorenz curve and partial Gini index, International Journal of Engineering, Science and Innovative Technology, 1, 296

Rodriguez, 2006, Rotation forest: A new classifier ensemble method, IEEE Transactions on Pattern Analysis and Machine Intelligence, 28, 1619, 10.1109/TPAMI.2006.211

Sinha, 2008, Incorporating domain knowledge into data mining classifiers: An application in indirect lending, Decision Support Systems, 46, 287, 10.1016/j.dss.2008.06.013

So, 2011, Modelling the profitability of credit cards by Markov decision processes, European Journal of Operational Research, 212, 123, 10.1016/j.ejor.2011.01.023

Sohn, 2014, Updating a credit-scoring model based on new attributes without realization of actual data, European Journal of Operational Research, 234, 119, 10.1016/j.ejor.2013.02.030

Šušteršič, 2009, Consumer credit scoring models with limited data, Expert Systems with Applications, 36, 4736, 10.1016/j.eswa.2008.06.016

Thomas, 2010, Consumer finance: Challenges for operational research, Journal of the Operational Research Society, 61, 41, 10.1057/jors.2009.104

Thomas, 2002

Tong, 2012, Mixture cure models in credit scoring: If and when borrowers default, European Journal of Operational Research, 218, 132, 10.1016/j.ejor.2011.10.007

Tsai, 2014, Combining cluster analysis with classifier ensembles to predict financial distress, Information Fusion, 16, 46, 10.1016/j.inffus.2011.12.001

Tsai, 2008, Using neural network ensembles for bankruptcy prediction and credit scoring, Expert Systems with Applications, 34, 2639, 10.1016/j.eswa.2007.05.019

Tsai, 2009, The consumer loan default predicting model—An application of DEA-DA and neural network, Expert Systems with Applications, 36, 11682, 10.1016/j.eswa.2009.03.009

Twala, 2010, Multiple classifier application to credit risk assessment, Expert Systems with Applications, 37, 3326, 10.1016/j.eswa.2009.10.018

Verbeke, 2012, New insights into churn prediction in the telecommunication sector: A profit driven data mining approach, European Journal of Operational Research, 218, 211, 10.1016/j.ejor.2011.09.031

Verbraken, 2014, Development and application of consumer credit scoring models using profit-based classification measures, European Journal of Operational Research, 238, 505, 10.1016/j.ejor.2014.04.001

Viaene, 2004, Cost-sensitive learning and decision making revisited, European Journal of Operational Research, 166, 212, 10.1016/j.ejor.2004.03.031

Wang, 2011, A comparative assessment of ensemble learning for credit scoring, Expert Systems with Applications, 38, 223, 10.1016/j.eswa.2010.06.048

West, 2005, Neural network ensemble strategies for financial decision applications, Computers & Operations Research, 32, 2543, 10.1016/j.cor.2004.03.017

Woloszynski, 2011, A probabilistic model of classifier competence for dynamic ensemble selection, Pattern Recognition, 44, 2656, 10.1016/j.patcog.2011.03.020

Xiao, 2006, A comparative study of data mining methods in consumer loans credit scoring management, Journal of Systems Science and Systems Engineering, 15, 419, 10.1007/s11518-006-5023-5

Xu, 2009, Credit scoring algorithm based on link analysis ranking with support vector machine, Expert Systems with Applications, 36, 2625, 10.1016/j.eswa.2008.01.024

Yang, 2007, Adaptive credit scoring with kernel learning methods, European Journal of Operational Research, 183, 1521, 10.1016/j.ejor.2006.10.066

Yao, 2015, Support vector regression for loss given default modelling, European Journal of Operational Research, 240, 528, 10.1016/j.ejor.2014.06.043

Yap, 2011, Using data mining to improve assessment of credit worthiness via credit scoring models, Expert Systems with Applications, 38, 13274, 10.1016/j.eswa.2011.04.147

Yu, 2008, Credit risk assessment with a multistage neural network ensemble learning approach, Expert Systems with Applications, 34, 1434, 10.1016/j.eswa.2007.01.009

Yu, 2009, An intelligent-agent-based fuzzy group decision making model for financial multicriteria decision support: The case of credit scoring, European Journal of Operational Research, 195, 942, 10.1016/j.ejor.2007.11.025

Yu, 2011, Credit risk evaluation using a weighted least squares SVM classifier with design of experiment for parameter selection, Expert Systems with Applications, 38, 15392, 10.1016/j.eswa.2011.06.023

Yu, 2010, Support vector machine based multiagent ensemble learning for credit risk evaluation, Expert Systems with Applications, 37, 1351, 10.1016/j.eswa.2009.06.083

Zhang, 2010, Vertical bagging decision trees model for credit scoring, Expert Systems with Applications, 37, 7838, 10.1016/j.eswa.2010.04.054

Zhang, 2009, Several multi-criteria programming methods for classification, Computers & Operations Research, 36, 823, 10.1016/j.cor.2007.11.001

Zhou, 2010, Least Squares Support Vector Machines ensemble models for credit scoring, Expert Systems with Applications, 37, 127, 10.1016/j.eswa.2009.05.024