Cost-sensitive boosted tree for loan evaluation in peer-to-peer lending

Electronic Commerce Research and Applications - Volume 24 - Pages 30-49 - 2017
Yufei Xia¹, Chuanzhe Liu¹, Nana Liu¹
¹School of Management, China University of Mining and Technology, Xuzhou, Jiangsu 221116, PR China

References

Alejo, 2013, 1

Bahnsen, A.C., Aouada, D., Ottersten, B., 2014a. Example-dependent cost-sensitive logistic regression for credit scoring. In: Proceedings of 13th International Conference on Machine Learning and Applications (ICMLA). IEEE, pp. 263–269.

Bahnsen, 2015, Example-dependent cost-sensitive decision trees, Expert Syst. Appl., 42, 6609, 10.1016/j.eswa.2015.04.042

Bauer, 1999, An empirical comparison of voting classification algorithms: Bagging, boosting, and variants, Mach. Learn., 36, 105, 10.1023/A:1007515423169

Bergstra, J.S., Bardenet, R., Bengio, Y., Kégl, B., 2011. Algorithms for hyper-parameter optimization. In: Proceedings of Advances in Neural Information Processing Systems. NIPS, Granada, Spain, pp. 2546–2554.

Bishop, 2006

Byanjankar, A., Heikkilä, M., Mezei, J., 2015. Predicting credit risk in peer-to-peer lending: A neural network approach. In: 2015 IEEE Symposium Series on Computational Intelligence. IEEE, pp. 719–725.

Caruana, R., Niculescu-Mizil, A., 2006. An empirical comparison of supervised learning algorithms. In: Proceedings of the 23rd international conference on Machine learning. ACM, pp. 161–168.

Chen, N., Vieira, A., Duarte, J., 2009. Cost-sensitive LVQ for bankruptcy prediction: An empirical study. In: 2nd IEEE International Conference on Computer Science and Information Technology. IEEE, pp. 115–119.

Chen, T., Guestrin, C., 2016a. Xgboost: A scalable tree boosting system. arXiv preprint arXiv:1603.02754.

Chen, T., Guestrin, C., 2016b. Xgboost: A scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, pp. 785–794.

Chen, T., He, T., 2015. xgboost: eXtreme Gradient Boosting. R package version 0.4-2.

Crean, 2005, Point of view revealing the true meaning of the IRR via profiling the IRR and defining the ERR, J. Real Estate Portf. Manage., 11, 323, 10.1080/10835547.2005.12089725

Crone, 2012, Instance sampling in credit scoring: An empirical study of sample size and balancing, Int. J. Forecasting, 28, 224, 10.1016/j.ijforecast.2011.07.006

Dietterich, 2000, An experimental comparison of three methods for constructing ensembles of decision trees: Bagging, boosting, and randomization, Mach. Learn., 40, 139, 10.1023/A:1007607513941

Elkan, C., 2001. The foundations of cost-sensitive learning. In: International joint conference on artificial intelligence. Lawrence Erlbaum Associates Ltd, pp. 973–978.

Emekter, 2015, Evaluating credit risk and loan performance in online Peer-to-Peer (P2P) lending, Appl. Econ., 47, 54, 10.1080/00036846.2014.962222

Fan, W., Stolfo, S.J., 1999. AdaCost: misclassification cost-sensitive boosting. In: Proceedings of 16th International Conference on Machine Learning. pp. 97–105.

Finlay, 2010, Credit scoring for profitability objectives, Eur. J. Oper. Res., 202, 528, 10.1016/j.ejor.2009.05.025

Finlay, 2011, Multiple classifier architectures and their application to credit risk assessment, Eur. J. Oper. Res., 210, 368, 10.1016/j.ejor.2010.09.029

Friedman, 2001, Greedy function approximation: a gradient boosting machine, Ann. Stat., 1189, 10.1214/aos/1013203451

García, 2012, Improving risk predictions by preprocessing imbalanced credit data, 68

Gonzalez, 2014, When can a photo increase credit? The impact of lender and borrower profiles on online peer-to-peer loans, J. Behav. Exp. Financ., 2, 44, 10.1016/j.jbef.2014.04.002

Guo, 2016, Instance-based credit risk assessment for investment decisions in P2P lending, Eur. J. Oper. Res., 249, 417, 10.1016/j.ejor.2015.05.050

Hand, 2005, Good practice in retail credit scorecard assessment, J. Oper. Res. Soc., 56, 1109, 10.1057/palgrave.jors.2601932

Hand, 2009, Measuring classifier performance: a coherent alternative to the area under the ROC curve, Mach. Learn., 77, 103, 10.1007/s10994-009-5119-5

Herzenstein, 2011, Tell me a good story and I may lend you money: The role of narratives in peer-to-peer lending decisions, J. Mark. Res., 48, 138, 10.1509/jmkr.48.SPL.S138

Ho, T.K., 1995. Random decision forests. In: Proceedings of the Third International Conference on Document Analysis and Recognition. IEEE, pp. 278–282.

Huysmans, 2011, An empirical evaluation of the comprehensibility of decision table, tree and rule based predictive models, Decis. Support Syst., 51, 141, 10.1016/j.dss.2010.12.003

James, 2013

Johnson, 2014, Learning nonlinear functions using regularized greedy forest, IEEE Trans. Pattern Anal. Mach. Intell., 36, 942, 10.1109/TPAMI.2013.159

Khashman, 2010, Neural networks for credit risk evaluation: Investigation of different neural models and learning schemes, Expert Syst. Appl., 37, 6233, 10.1016/j.eswa.2010.02.101

Kim, 2012, Classification cost: An empirical comparison among traditional classifier, Cost-Sensitive Classifier, and MetaCost, Expert Syst. Appl., 39, 4013, 10.1016/j.eswa.2011.09.071

Klafft, M., 2008. Peer to peer lending: auctioning microcredits over the internet. In: Proceedings of the International Conference on Information Systems, Technology and Management.

Kohavi, R., Wolpert, D.H., 1996. Bias plus variance decomposition for zero-one loss functions. In: ICML. pp. 275–283.

Kriegler, 2010, Small area estimation of the homeless in Los Angeles: An application of cost-sensitive stochastic gradient boosting, Ann. Appl. Stat., 1234, 10.1214/10-AOAS328

Lee, 2012, Herding behavior in online P2P lending: An empirical investigation, Electron. Commer. Res. Appl., 11, 495, 10.1016/j.elerap.2012.02.001

Lessmann, 2015, Benchmarking state-of-the-art classification algorithms for credit scoring: An update of research, Eur. J. Oper. Res., 247, 124, 10.1016/j.ejor.2015.05.030

Ling, 2011, 231

Lomax, 2013, A survey of cost-sensitive decision tree induction algorithms, ACM Comput. Surv. (CSUR), 45, 16, 10.1145/2431211.2431215

Magee, 2011, Peer-to-Peer Lending in the United States: Surviving after Dodd-Frank, North Carolina Banking Institute, 15, 139

Magni, 2013, The internal rate of return approach and the AIRR paradigm: a refutation and a corroboration, Eng. Econ., 58, 73, 10.1080/0013791X.2012.745916

Malekipirbazari, 2015, Risk assessment in social lending via random forests, Expert Syst. Appl., 42, 4621, 10.1016/j.eswa.2015.02.001

Markowitz, 1952, Portfolio selection, J. Financ., 7, 77

Marqués, 2013, On the suitability of resampling techniques for the class imbalance problem in credit scoring, J. Oper. Res. Soc., 64, 1060, 10.1057/jors.2012.120

Masnadi-Shirazi, 2011, Cost-sensitive boosting, IEEE Trans. Pattern Anal. Mach. Intell., 33, 294, 10.1109/TPAMI.2010.71

Mild, 2015, How low can you go?—Overcoming the inability of lenders to set proper interest rates on unsecured peer-to-peer lending markets, J. Bus. Res., 68, 1291, 10.1016/j.jbusres.2014.11.021

Min, 2005, Bankruptcy prediction using support vector machine with optimal choice of kernel function parameters, Expert Syst. Appl., 28, 603, 10.1016/j.eswa.2004.12.008

Nikolaou, 2016, Cost-sensitive boosting algorithms: Do we really need them?, Mach. Learn., 104, 359, 10.1007/s10994-016-5572-x

Platt, 1999, Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. Advances in large margin classifiers, 10, 61

Safavian, 1991, A survey of decision tree classifier methodology, IEEE Trans. Syst. Man Cybern., 21, 660, 10.1109/21.97458

Sahin, 2013, A cost-sensitive decision tree approach for fraud detection, Expert Syst. Appl., 40, 5916, 10.1016/j.eswa.2013.05.021

Seiffert, 2010, RUSBoost: A hybrid approach to alleviating class imbalance, IEEE Trans. Syst. Man Cybern. Part A Syst. Hum., 40, 185, 10.1109/TSMCA.2009.2029559

Serrano-Cinca, 2016, The use of profit scoring as an alternative to credit scoring systems in peer-to-peer (P2P) lending, Decis. Support Syst., 89, 113, 10.1016/j.dss.2016.06.014

Sun, 2007, Cost-sensitive boosting for classification of imbalanced data, Pattern Recogn., 40, 3358, 10.1016/j.patcog.2007.04.009

Sun, Y., Wong, A.K., Wang, Y., 2005. Parameter inference of cost-sensitive boosting algorithms. In: Proceedings of International Workshop on Machine Learning and Data Mining in Pattern Recognition. Springer, pp. 21–30.

Twala, 2010, Multiple classifier application to credit risk assessment, Expert Syst. Appl., 37, 3326, 10.1016/j.eswa.2009.10.018

Verbeke, 2012, New insights into churn prediction in the telecommunication sector: A profit driven data mining approach, Eur. J. Oper. Res., 218, 211, 10.1016/j.ejor.2011.09.031

Verbraken, 2014, Development and application of consumer credit scoring models using profit-based classification measures, Eur. J. Oper. Res., 238, 505, 10.1016/j.ejor.2014.04.001

Wang, 2012, Two credit scoring models based on dual strategy ensemble trees, Knowledge-Based Syst., 26, 61, 10.1016/j.knosys.2011.06.020

Wei, 2015, Internet lending in China: Status quo, potential risks and regulatory options, Comput. Law & Secur. Rev., 31, 793, 10.1016/j.clsr.2015.08.005

West, 2005, Neural network ensemble strategies for financial decision applications, Comput. & Oper. Res., 32, 2543, 10.1016/j.cor.2004.03.017

Wiginton, 1980, A note on the comparison of logit and discriminant models of consumer credit behavior, J. Financ. Quant. Anal., 15, 757, 10.2307/2330408

Xia, 2017, A boosted decision tree approach using Bayesian hyper-parameter optimization for credit scoring, Expert Syst. Appl., 78, 225, 10.1016/j.eswa.2017.02.017

Xiao, 2012, Dynamic classifier ensemble model for customer classification with imbalanced class distribution, Expert Syst. Appl., 39, 3668, 10.1016/j.eswa.2011.09.059

Yu, 2010, Support vector machine based multiagent ensemble learning for credit risk evaluation, Expert Syst. Appl., 37, 1351, 10.1016/j.eswa.2009.06.083

Yum, 2012, From the wisdom of crowds to my own judgment in microfinance through online peer-to-peer lending platforms, Electron. Commer. Res. Appl., 11, 469, 10.1016/j.elerap.2012.05.003

Zhao, H., Liu, Q., Wang, G., Ge, Y., Chen, E., 2016. Portfolio selections in P2P lending: A multi-objective perspective. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, pp. 2075–2084.

Zhao, H., Wu, L., Liu, Q., Ge, Y., Chen, E., 2014. Investment recommendation in P2P lending: A portfolio perspective with risk management. In: 2014 IEEE International Conference on Data Mining (ICDM). IEEE, pp. 1109–1114.

Zięba, 2016, Ensemble boosted trees with synthetic features generation in application to bankruptcy prediction, Expert Syst. Appl., 58, 93, 10.1016/j.eswa.2016.04.001