Resampling ensemble model based on data distribution for imbalanced credit risk evaluation in P2P lending

Information Sciences - Volume 536 - Pages 120-134 - 2020
Kun Niu1, Zaimei Zhang2, Yan Liu1, Renfa Li1
1Key Laboratory for Embedded and Network Computing of Hunan Province, Hunan University, Changsha, Hunan, China
2Changsha University of Science and Technology, Changsha, Hunan, China

References

Bastani, 2019, Wide and deep learning for peer-to-peer lending, Expert Syst. Appl., 134, 209, 10.1016/j.eswa.2019.05.042

Xia, 2018, A rejection inference technique based on contrastive pessimistic likelihood estimation for P2P lending, Electron. Commer. Res. Appl., 30, 111, 10.1016/j.elerap.2018.05.011

Mild, 2015, How low can you go? – overcoming the inability of lenders to set proper interest rates on unsecured peer-to-peer lending markets, J. Business Res., 68, 1291, 10.1016/j.jbusres.2014.11.021

Li, 2017, Reject inference in credit scoring using semi-supervised support vector machines, Expert Syst. Appl., 74, 105, 10.1016/j.eswa.2017.01.011

Chen, 2018, GFlink: an in-memory computing architecture on heterogeneous CPU-GPU clusters for big data, IEEE Trans. Parallel Distrib. Syst., 29, 1275, 10.1109/TPDS.2018.2794343

Chen, 2019, Performance-aware model for sparse matrix-matrix multiplication on the Sunway TaihuLight supercomputer, IEEE Trans. Parallel Distrib. Syst., 30, 923, 10.1109/TPDS.2018.2871189

Okesola, 2017, An improved bank credit scoring model: a naive Bayesian approach, 228

Hand, 2002, Superscorecards, IMA J. Manage. Math., 13, 273, 10.1093/imaman/13.4.273

Altman, 2002, Modeling credit risk for SMEs: evidence from the US market, New York, 19, 1

West, 2000, Neural network credit scoring models, Comput. Oper. Res., 27, 1131, 10.1016/S0305-0548(99)00149-5

Xia, 2018, A novel heterogeneous ensemble credit scoring model based on bstacking approach, Expert Syst. Appl., 93, 182, 10.1016/j.eswa.2017.10.022

Zhang, 2015, Maximizing reliability with energy conservation for parallel task scheduling in a heterogeneous cluster, Inf. Sci., 319, 113, 10.1016/j.ins.2015.02.023

Zhou, 2008, A new credit scoring method based on rough sets and decision tree, 1081

Henley, 1996, A k-nearest-neighbour classifier for assessing consumer credit risk, J. Roy. Stat. Soc., 45, 77

Cao, 2017, ℓ2,1 norm regularized multi-kernel based joint nonlinear feature selection and over-sampling for imbalanced data classification, Neurocomputing, 234, 38, 10.1016/j.neucom.2016.12.036

Brown, 2012, An experimental comparison of classification algorithms for imbalanced credit scoring data sets, Expert Syst. Appl., 39, 3446, 10.1016/j.eswa.2011.09.033

Yu, 2018, A DBN-based resampling SVM ensemble learning paradigm for credit classification with imbalanced data, Appl. Soft Comput., 69, 192, 10.1016/j.asoc.2018.04.049

Sun, 2018, Imbalanced enterprise credit evaluation with DTE-SBD: decision tree ensemble based on SMOTE and bagging with differentiated sampling rates, Inf. Sci., 425, 76, 10.1016/j.ins.2017.10.017

He, 2018, A novel ensemble method for credit scoring: adaption of different imbalance ratios, Expert Syst. Appl., 98, 105, 10.1016/j.eswa.2018.01.012

Jian, 2016, A new sampling method for classifying imbalanced data based on support vector machine ensemble, Neurocomputing, 193, 115, 10.1016/j.neucom.2016.02.006

Chawla, 2002, SMOTE: synthetic minority over-sampling technique, J. Artif. Intell. Res. (JAIR), 16, 321, 10.1613/jair.953

Han, 2005, Borderline-SMOTE: a new over-sampling method in imbalanced data sets learning, 878

Liu, 2009, Exploratory undersampling for class-imbalance learning, Syst. Man Cybern. Part B, 39, 539, 10.1109/TSMCB.2008.2007853

Hualong, 2013, ACOSampling: an ant colony optimization-based undersampling method for classifying imbalanced DNA microarray data, Neurocomputing, 101, 309, 10.1016/j.neucom.2012.08.018

Sun, 2007, Cost-sensitive boosting for classification of imbalanced data, Pattern Recogn., 40, 3358, 10.1016/j.patcog.2007.04.009

Fan, 1999, AdaCost: misclassification cost-sensitive boosting, 97

Zhou, 2016, Top k favorite probabilistic products queries, IEEE Trans. Knowl. Data Eng., 28, 2808, 10.1109/TKDE.2016.2584606

Nami, 2018, Cost-sensitive payment card fraud detection based on dynamic random forest and k-nearest neighbors, Expert Syst. Appl., 110, 381, 10.1016/j.eswa.2018.06.011

de Sá, 2018, A customized classification algorithm for credit card fraud detection, Eng. Appl. Artif. Intell., 72, 21

Yu, 2015, Support vector machine-based optimized decision threshold adjustment strategy for classifying imbalanced data, Knowl.-Based Syst., 76, 67, 10.1016/j.knosys.2014.12.007

Manevitz, 2002, One-class SVMs for document classification, J. Mach. Learn. Res., 2, 139

Shen, 2019, A novel ensemble classification model based on neural networks and a classifier optimisation technique for imbalanced credit risk evaluation, Physica A, 526, 10.1016/j.physa.2019.121073

Akbani, 2004, Applying support vector machines to imbalanced datasets, 39

Argamon-Engelson, 1999, Committee-based sample selection for probabilistic classifiers, J. Artif. Int. Res., 11, 335

Wang, 2009, Diversity analysis on imbalanced data sets by using ensemble models, 324

Chen, 2004

Yen, 2009, Cluster-based under-sampling approaches for imbalanced data distributions, Expert Syst. Appl., 36, 5718

Fisher, 1936, The use of multiple measurements in taxonomic problems, Ann. Eugenics, 7, 179, 10.1111/j.1469-1809.1936.tb02137.x

Orgler, 1970, A credit scoring model for commercial loans, J. Money Credit Bank., 2, 435, 10.2307/1991095

Young Sohn, 2016, Technology credit scoring model with fuzzy logistic regression, Appl. Soft Comput., 43, 150, 10.1016/j.asoc.2016.02.025

Bellotti, 2009, Support vector machines for credit scoring and discovery of significant features, Expert Syst. Appl., 36, 3302, 10.1016/j.eswa.2008.01.005

Zhang, 2018, Classifier selection and clustering with fuzzy assignment in ensemble model for credit scoring, Neurocomputing, 316, 210, 10.1016/j.neucom.2018.07.070

Tang, 2019, Applying a nonparametric random forest algorithm to assess the credit risk of the energy industry in China, Technol. Forecast. Soc. Chang., 144, 563, 10.1016/j.techfore.2018.03.007

Malekipirbazari, 2015, Risk assessment in social lending via random forests, Expert Syst. Appl., 42, 4621, 10.1016/j.eswa.2015.02.001

Plawiak, 2019, Application of new deep genetic cascade ensemble of SVM classifiers to predict the Australian credit scoring, Appl. Soft Comput., 84, 10.1016/j.asoc.2019.105740

Breiman, 1996, Bagging predictors, Mach. Learn., 24, 123, 10.1007/BF00058655

Fawcett, 2004, ROC graphs: notes and practical considerations for researchers, Mach. Learn., 31, 1