Statistical and machine learning models in credit scoring: A systematic literature survey
Abstract
Keywords
References
Thomas, 2002
Thomas, 2000, A survey of credit and behavioural scoring: forecasting financial risk of lending to consumers, Int. J. Forecast., 16, 149, 10.1016/S0169-2070(00)00034-0
Siddiqi, 2005
Myung, 2003, Tutorial on maximum likelihood estimation, J. Math. Psych., 47, 90, 10.1016/S0022-2496(02)00028-7
Baesens, 2003, Using neural network rule extraction and decision tables for credit-risk evaluation, Manage. Sci., 49, 312, 10.1287/mnsc.49.3.312.12739
Alaka, 2018, Systematic review of bankruptcy prediction models: Towards a framework for tool selection, Expert Syst. Appl., 94, 164, 10.1016/j.eswa.2017.10.040
Schlosser, 2007, 1
Bellovary, 2007, A review of bankruptcy prediction studies: 1930 to present, J. Financial Educ., 33, 1
Abdou, 2011, Credit scoring, statistical techniques and evaluation criteria: A review of the literature, Int. J. Intell. Syst. Account. Financ. Manage., 18, 59, 10.1002/isaf.325
Lin, 2012, Machine learning in financial crisis prediction: A survey, IEEE Trans. Syst. Man Cybern. C, 42, 421, 10.1109/TSMCC.2011.2170420
Wang, 2015, A survey of applying machine learning techniques for credit rating: existing models and open issues, 122
Louzada, 2016, Classification methods applied to credit scoring: Systematic review and overall comparison, Surv. Oper. Res. Manag. Sci., 21, 117
Devi, 2018
Liang, 2015, The effect of feature selection on financial distress prediction, Knowl.-Based Syst., 73, 289, 10.1016/j.knosys.2014.10.010
Brown, 2012, An experimental comparison of classification algorithms for imbalanced credit scoring data sets, Expert Syst. Appl., 39, 3446, 10.1016/j.eswa.2011.09.033
Bijak, 2012, Does segmentation always improve model performance in credit scoring?, Expert Syst. Appl., 39, 2433, 10.1016/j.eswa.2011.08.093
Chen, 2010, Combination of feature selection approaches with SVM in credit scoring, Expert Syst. Appl., 37, 4902, 10.1016/j.eswa.2009.12.025
W. Chen, L. Shi, Credit scoring with F-score based on support vector machine, in: Proceedings 2013 International Conference on Mechatronic Sciences, Electric Engineering and Computer, MEC, 2013, pp. 1512–1516.
Chen, 2017, The study of credit scoring model based on group lasso, Procedia Comput. Sci., 122, 677, 10.1016/j.procs.2017.11.423
Chi, 2012, A hybrid approach to integrate genetic algorithm into dual scoring model in enhancing the performance of credit scoring model, Expert Syst. Appl., 39, 2650, 10.1016/j.eswa.2011.08.120
Back, 1996, Neural networks and genetic algorithms for bankruptcy predictions, Expert Syst. Appl., 11, 407, 10.1016/S0957-4174(96)00055-3
Oreski, 2014, Genetic algorithm-based heuristic for feature selection in credit risk assessment, Expert Syst. Appl., 41, 2052, 10.1016/j.eswa.2013.09.004
Song, 2017, Feature selection based on FDA and F-score for multi-class classification, Expert Syst. Appl., 81, 22, 10.1016/j.eswa.2017.02.049
Pawlak, 1997, Rough set approach to knowledge-based decision support, European J. Oper. Res., 99, 48, 10.1016/S0377-2217(96)00382-7
Wang, 2010, Rough set and tabu search based feature selection for credit scoring, Procedia Comput. Sci., 1, 2425, 10.1016/j.procs.2010.04.273
Zhang, 2016, A survey on rough set theory and its applications, CAAI Trans. Intell. Technol., 1, 323, 10.1016/j.trit.2016.11.001
Tsai, 2009, Feature selection in bankruptcy prediction, Knowl.-Based Syst., 22, 120, 10.1016/j.knosys.2008.08.002
Mitchell, 1996
Kozeny, 2015, Genetic algorithms for credit scoring: Alternative fitness function performance comparison, Expert Syst. Appl., 42, 2998, 10.1016/j.eswa.2014.11.028
Crepinsek, 2013, Exploration and exploitation in evolutionary algorithms: A survey, ACM Comput. Surv., 45, 35:1, 10.1145/2480741.2480752
Liu, 2009, To explore or to exploit: An entropy-driven approach for evolutionary algorithms, KES J., 13, 185, 10.3233/KES-2009-0184
Cadenas, 2013, Feature subset selection filter–wrapper based on low quality data, Expert Syst. Appl., 40, 6241, 10.1016/j.eswa.2013.05.051
Tibshirani, 2011, Regression shrinkage and selection via the lasso: a retrospective, J. R. Stat. Soc. Ser. B Stat. Methodol., 73, 273, 10.1111/j.1467-9868.2011.00771.x
Zheng, 2018
S. Sehgal, H. Singh, M. Agarwal, V. Bhasker, Shantanu, Data analysis using principal component analysis, in: 2014 International Conference on Medical Imaging, M-Health and Emerging Communication Systems, MedCom, 2014, pp. 45–48.
Fisher, 1936, The use of multiple measurements in taxonomic problems, Ann. Eugen., 7, 179, 10.1111/j.1469-1809.1936.tb02137.x
Rao, 1948, The utilization of multiple measurements in problems of biological classification, J. R. Stat. Soc., 159, 10.1111/j.2517-6161.1948.tb00008.x
Duda, 2001
Reynolds, 2015, Gaussian mixture models, 827
Dempster, 1977, Maximum likelihood from incomplete data via the EM algorithm, J. R. Stat. Soc. Ser. B Stat. Methodol., 39, 1, 10.1111/j.2517-6161.1977.tb01600.x
Henley, 1996, A k-nearest-neighbour classifier for assessing consumer credit risk, J. R. Stat. Soc., 45, 77
Schölkopf, 2000, The kernel trick for distances, 283
Mitchell, 1997
Barboza, 2017, Machine learning models and bankruptcy prediction, Expert Syst. Appl., 83, 405, 10.1016/j.eswa.2017.04.006
Tsai, 2014, A comparative study of classifier ensembles for bankruptcy prediction, Appl. Soft Comput., 24, 977, 10.1016/j.asoc.2014.08.047
Tsai, 2008, Using neural network ensembles for bankruptcy prediction and credit scoring, Expert Syst. Appl., 34, 2639, 10.1016/j.eswa.2007.05.019
M.D. Odom, R. Sharda, A neural network model for bankruptcy prediction, in: 1990 IJCNN International Joint Conference on Neural Networks, vol. 2, 1990, pp. 163–168.
Freund, 1997, A decision-theoretic generalization of on-line learning and an application to boosting, J. Comput. System Sci., 55, 119, 10.1006/jcss.1997.1504
Chen, 2016, XGBoost: A scalable tree boosting system, 785
Nobre, 2019, Combining principal component analysis, discrete wavelet transform and xgboost to trade in the financial markets, Expert Syst. Appl., 125, 10.1016/j.eswa.2019.01.083
Xia, 2017, A boosted decision tree approach using Bayesian hyper-parameter optimization for credit scoring, Expert Syst. Appl., 78, 10.1016/j.eswa.2017.02.017
Yu, 2018, A DBN-based resampling SVM ensemble learning paradigm for credit classification with imbalanced data, Appl. Soft Comput., 69, 192, 10.1016/j.asoc.2018.04.049
Hinton, 2006, A fast learning algorithm for deep belief nets, Neural Comput., 18, 1527, 10.1162/neco.2006.18.7.1527
LeCun, 1989, Backpropagation applied to handwritten zip code recognition, Neural Comput., 1, 541, 10.1162/neco.1989.1.4.541
Seo, 2019, Hierarchical convolutional neural networks for fashion image classification, Expert Syst. Appl., 116, 328, 10.1016/j.eswa.2018.09.022
LeCun, 1995
Ting, 2019, Convolutional neural network improvement for breast cancer classification, Expert Syst. Appl., 120, 103, 10.1016/j.eswa.2018.11.008
Sezer, 2018, Algorithmic financial trading with deep convolutional neural networks: time series to image conversion approach, Appl. Soft Comput., 70, 525, 10.1016/j.asoc.2018.04.024
Zhao, 2017, Convolutional neural networks for time series classification, J. Syst. Eng. Electron., 28, 162, 10.21629/JSEE.2017.01.18
Chollet, 2017
Goodfellow, 2016
Bishop, 1995
Tomczak, 2015, Classification restricted Boltzmann machine for comprehensible credit scoring model, Expert Syst. Appl., 42, 1789, 10.1016/j.eswa.2014.10.016
Douzas, 2017, Self-organizing map oversampling (SOMO) for imbalanced data set learning, Expert Syst. Appl., 82, 40, 10.1016/j.eswa.2017.03.073
Chawla, 2002, SMOTE: Synthetic minority over-sampling technique, J. Artificial Intelligence Res., 16, 321, 10.1613/jair.953
Saia, 2018
Saia, 2016
Craven, 1995, Extracting tree-structured representations of trained networks, 24
Ribeiro, 2016, “Why should I trust you?”: Explaining the predictions of any classifier, CoRR, abs/1602.04938
Eisenbeis, 1978, Problems in applying discriminant analysis in credit scoring models, J. Bank. Financ., 2, 205, 10.1016/0378-4266(78)90012-2
John, 1995, Estimating continuous distributions in Bayesian classifiers, 338
Yu, 2011, Credit risk evaluation using a weighted least squares SVM classifier with design of experiment for parameter selection, Expert Syst. Appl., 38, 15392, 10.1016/j.eswa.2011.06.023
Y. Jiang, Credit scoring model based on the decision tree and the simulated annealing algorithm, in: 2009 WRI World Congress on Computer Science and Information Engineering, Vol. 4, 2009, pp. 18–22.
Setiono, 1997, Neurolinear: From neural networks to oblique decision rules, Neurocomputing, 17, 1, 10.1016/S0925-2312(97)00038-6
R. Caruana, Y. Lou, J. Gehrke, P. Koch, M. Sturm, N. Elhadad, Intelligible models for healthCare: Predicting pneumonia risk and hospital 30-day readmission, in: KDD ’15, 2015.
Lundberg, 2017, A unified approach to interpreting model predictions, CoRR, abs/1705.07874
Friedman, 2000, Greedy function approximation: A gradient boosting machine, Ann. Statist., 29, 1189, 10.1214/aos/1013203451
Luo, 2017, A deep learning approach for credit scoring using credit default swaps, Eng. Appl. Artif. Intell., 65, 465, 10.1016/j.engappai.2016.12.002
S. Ramasamy, K. Rajaraman, A hybrid meta-cognitive restricted Boltzmann machine classifier for credit scoring, in: TENCON 2017 - 2017 IEEE Region 10 Conference, 2017, pp. 2313–2318.
K. Tran, T. Duong, Q. Ho, Credit scoring model: A combination of genetic programming and deep learning, in: 2016 Future Technologies Conference, FTC, 2016, pp. 145–149.
S.H. Yeh, C.J. Wang, M.F. Tsai, Deep belief networks for predicting corporate defaults, in: 2015 24th Wireless and Optical Communication Conference, WOCC, 2015, pp. 159–163.
V. Neagoe, A. Ciotec, G. Cucu, Deep convolutional neural networks versus multilayer perceptron for financial prediction, in: 2018 International Conference on Communications, COMM, 2018, pp. 201–206.
Hamori, 2018, Ensemble learning or deep learning? Application to default risk analysis, J. Risk Financial Manag., 11, 10.3390/jrfm11010012
Shorten, 2019, A survey on image data augmentation for deep learning, J. Big Data, 6, 60, 10.1186/s40537-019-0197-0
Gómez-Ríos, 2018, Towards highly accurate coral texture images classification using deep convolutional neural networks and data augmentation, CoRR, abs/1804.00516
Kvamme, 2018, Predicting mortgage default using convolutional neural networks, Expert Syst. Appl., 102, 10.1016/j.eswa.2018.02.029
Krizhevsky, 2012, ImageNet classification with deep convolutional neural networks, Neural Inf. Process. Syst., 25
Perez, 2017, The effectiveness of data augmentation in image classification using deep learning, CoRR
Salamon, 2016, Deep convolutional neural networks and data augmentation for environmental sound classification, CoRR
Frid-Adar, 2018, Synthetic data augmentation using GAN for improved liver lesion classification, CoRR
B. Zhu, W. Yang, H. Wang, Y. Yuan, A hybrid deep learning model for consumer credit scoring, in: 2018 International Conference on Artificial Intelligence and Big Data, ICAIBD, 2018, pp. 205–208.
M.F. Kiani, F. Mahmoudi, A new hybrid method for credit scoring based on clustering and support vector machine (ClsSVM), in: 2010 2nd IEEE International Conference on Information and Financial Engineering, 2010, pp. 585–589.
Zhang, 2010, Vertical bagging decision trees model for credit scoring, Expert Syst. Appl., 37, 7838, 10.1016/j.eswa.2010.04.054
Farquad, 2011, Credit scoring using PCA-SVM hybrid model, 249
Ping, 2011, Neighborhood rough set and SVM based hybrid credit scoring classifier, Expert Syst. Appl., 38, 11300, 10.1016/j.eswa.2011.02.179
Wang, 2012, Rough set and scatter search metaheuristic based feature selection for credit scoring, Expert Syst. Appl., 39, 6123, 10.1016/j.eswa.2011.11.011
Han, 2013, Orthogonal support vector machine for credit scoring, Eng. Appl. Artif. Intell., 26, 848, 10.1016/j.engappai.2012.10.005
Shi, 2013, Credit scoring by feature-weighted support vector machines, J. Zhejiang Univ. Sci. C, 14, 197, 10.1631/jzus.C1200205
Q. Li, J. Zhang, Y. Wang, K. Kang, Credit risk classification using discriminative restricted boltzmann machines, in: 2014 IEEE 17th International Conference on Computational Science and Engineering, 2014, pp. 1697–1700.
Maldonado, 2017, Cost-based feature selection for support vector machines: An application in credit scoring, European J. Oper. Res., 261, 656, 10.1016/j.ejor.2017.02.037
H. Sutrisno, S. Halim, Credit scoring refinement using optimized logistic regression, in: 2017 International Conference on Soft Computing, Intelligent System and Information Technology, ICSIIT, 2017, pp. 26–31.
Mancisidor, 2018
X. Zhang, Y. Yang, Z. Zhou, A novel credit scoring model based on optimized random forest, in: 2018 IEEE 8th Annual Computing and Communication Workshop and Conference, CCWC, 2018, pp. 60–65.
Jadhav, 2018, Information gain directed genetic algorithm wrapper feature selection for credit rating, Appl. Soft Comput., 69, 541, 10.1016/j.asoc.2018.04.033
Dong, 2010, Credit scorecard based on logistic regression with random coefficients, Procedia Comput. Sci., 1, 2463, 10.1016/j.procs.2010.04.278
Twala, 2010, Multiple classifier application to credit risk assessment, Expert Syst. Appl., 37, 3326, 10.1016/j.eswa.2009.10.018
Hsieh, 2010, A data driven ensemble classifier for credit scoring analysis, Expert Syst. Appl., 37, 534, 10.1016/j.eswa.2009.05.059
Wang, 2011, A comparative assessment of ensemble learning for credit scoring, Expert Syst. Appl., 38, 223, 10.1016/j.eswa.2010.06.048
Q. Wang, K.K. Lai, D. Niu, Green credit scoring system and its risk assessemt model with support vector machine, in: 2011 Fourth International Joint Conference on Computational Sciences and Optimization, 2011, pp. 284–287.
Ribeiro, 2011, Deep belief networks for financial prediction, 766
Yap, 2011, Using data mining to improve assessment of credit worthiness via credit scoring models, Expert Syst. Appl., 38, 13274, 10.1016/j.eswa.2011.04.147
Louzada, 2011, Poly-bagging predictors for classification modelling for credit scoring, Expert Syst. Appl., 38, 12717, 10.1016/j.eswa.2011.04.059
Marqués, 2012, Exploring the behaviour of base classifiers in credit scoring ensembles, Expert Syst. Appl., 39, 10244, 10.1016/j.eswa.2012.02.092
Marqués, 2012, Two-level classifier ensembles for credit risk assessment, Expert Syst. Appl., 39, 10916, 10.1016/j.eswa.2012.03.033
B. Tang, S. Qiu, A new credit scoring method based on improved fuzzy support vector machine, in: 2012 IEEE International Conference on Computer Science and Automation Engineering, CSAE, Vol. 3, 2012, pp. 73–75.
Louzada, 2012, On the impact of disproportional samples in credit scoring models: An application to a Brazilian bank data, Expert Syst. Appl., 39, 8071, 10.1016/j.eswa.2012.01.134
Abellán, 2014, Improving experimental studies about ensembles of classifiers for bankruptcy prediction and credit scoring, Expert Syst. Appl., 41, 3825, 10.1016/j.eswa.2013.12.003
Harris, 2015, Credit scoring using the clustered support vector machine, Expert Syst. Appl., 42, 741, 10.1016/j.eswa.2014.08.029
B. Yi, J. Zhu, Credit scoring with an improved fuzzy support vector machine based on grey incidence analysis, in: 2015 IEEE International Conference on Grey Systems and Intelligent Services, GSIS, 2015, pp. 173–178.
Jones, 2015, An empirical evaluation of the performance of binary classifiers in the prediction of credit ratings changes, J. Bank. Finance, 56, 72, 10.1016/j.jbankfin.2015.02.006
J. Chen, L. Xu, A method of improving credit evaluation with support vector machines, in: 2015 11th International Conference on Natural Computation, ICNC, 2015, pp. 615–619.
Zhao, 2015, Investigation and improvement of multi-layer perceptron neural networks for credit scoring, Expert Syst. Appl., 42, 3508, 10.1016/j.eswa.2014.12.006
Florez-Lopez, 2015, Enhancing accuracy and interpretability of ensemble strategies in credit risk assessment. A correlated-adjusted decision forest proposal, Expert Syst. Appl., 42, 5737, 10.1016/j.eswa.2015.02.042
M. Aláraj, M. Abbod, A systematic credit scoring model based on heterogeneous classifier ensembles, in: 2015 International Symposium on Innovations in Intelligent SysTems and Applications, INISTA, 2015, pp. 1–7.
Aláraj, 2016, Classifiers consensus system approach for credit scoring, Knowl.-Based Syst., 104, 89, 10.1016/j.knosys.2016.04.013
Yu, 2016, A novel multistage deep belief network based extreme learning machine ensemble learning paradigm for credit risk assessment, Flex. Serv. Manuf. J., 28, 576, 10.1007/s10696-015-9226-2
Xiao, 2016, Ensemble classification based on supervised clustering for credit scoring, Appl. Soft Comput., 43, 73, 10.1016/j.asoc.2016.02.022
Bequé, 2017, Extreme learning machines for credit scoring: An empirical evaluation, Expert Syst. Appl., 86, 42, 10.1016/j.eswa.2017.05.050
A. Lawi, F. Aziz, S. Syarif, Ensemble gradientboost for increasing classification accuracy of credit scoring, in: 2017 4th International Conference on Computer Applications and Information Processing Technology, CAIPT, 2017, pp. 1–4.
Y. Li, X. Lin, X. Wang, F. Shen, Z. Gong, Credit risk assessment algorithm using deep neural networks with clustering and merging, in: 2017 13th International Conference on Computational Intelligence and Security, CIS, 2017, pp. 173–176.
Li, 2017, Reject inference in credit scoring using semi-supervised support vector machines, Expert Syst. Appl., 74, 105, 10.1016/j.eswa.2017.01.011
O.J. Okesola, K.O. Okokpujie, A.A. Adewale, S.N. John, O. Omoruyi, An improved bank credit scoring model: A Naïve Bayesian approach, in: 2017 International Conference on Computational Science and Computational Intelligence, CSCI, 2017, pp. 228–233.
H. Chen, M. Jiang, X. Wang, Bayesian ensemble assessment for credit scoring, in: 2017 4th International Conference on Industrial Economics System and Industrial Security Engineering, IEIS, 2017, pp. 1–5.
Abellán, 2017, A comparative study on base classifiers in ensemble methods for credit scoring, Expert Syst. Appl., 73, 1, 10.1016/j.eswa.2016.12.020
Vanderheyden, 2018
Martey Addo, 2018, Credit risk analysis using machine and deep learning models, Risks, 6, 38, 10.3390/risks6020038
Xia, 2018, A novel heterogeneous ensemble credit scoring model based on bstacking approach, Expert Syst. Appl., 93, 182, 10.1016/j.eswa.2017.10.022
Chang, 2018, Application of extreme gradient boosting trees in the construction of credit risk assessment models for financial institutions, Appl. Soft Comput., 73, 10.1016/j.asoc.2018.09.029
Li, 2018, Heterogeneous ensemble for default prediction of peer-to-peer lending in China, IEEE Access, 6, 54396, 10.1109/ACCESS.2018.2810864
Cao, 2018, Performance evaluation of machine learning approaches for credit scoring, Int. J. Econ. Finance Manag. Sci., 6, 255
Basel Committee on Banking Supervision, 2006, Basel II: International convergence of capital measurement and capital standards: A revised framework - comprehensive version, Bank for International Settlements, BIS