Machine learning-driven credit risk: a systemic review

Neural Computing and Applications - Volume 34 - Pages 14327-14339 - 2022
Si Shi [1], Rita Tse [1,2], Wuman Luo [1], Stefano D’Addona [3], Giovanni Pau [4,5]
[1] Faculty of Applied Sciences, Macao Polytechnic University, Macao SAR, China
[2] Engineering Research Centre of Applied Technology on Machine Translation and Artificial Intelligence of Ministry of Education, Macao Polytechnic University, Macao SAR, China
[3] Department of Political Science, University of Roma Tre, Rome, Italy
[4] Department of Computer Science and Engineering, University of Bologna, Bologna, Italy
[5] UCLA Samueli Computer Science, University of California, Los Angeles, USA

Abstract

Credit risk assessment is at the core of modern economies. Traditionally, it is measured by statistical methods and manual auditing. Recent advances in financial artificial intelligence stem from a new wave of machine learning (ML)-driven credit risk models that have gained tremendous attention from both industry and academia. In this paper, we systematically review a series of major research contributions (76 papers) from the past eight years that use statistical, machine learning and deep learning techniques to address the problems of credit risk. Specifically, we propose a novel classification methodology for ML-driven credit risk algorithms and rank their performance using public datasets. We further discuss the challenges, including data imbalance, dataset inconsistency, model transparency, and inadequate utilization of deep learning models. The results of our review show that: 1) most deep learning models outperform classic machine learning and statistical algorithms in credit risk estimation, and 2) ensemble methods provide higher accuracy than single models. Finally, we present summary tables of the datasets and proposed models.
