A boosted decision tree approach using Bayesian hyper-parameter optimization for credit scoring
Abstract
Keywords
References
Ala'raj, 2016, Classifiers consensus system approach for credit scoring, Knowledge-Based Systems, 104, 89, 10.1016/j.knosys.2016.04.013
Ala'raj, 2016, A new hybrid ensemble credit scoring model based on classifiers consensus system approach, Expert Systems with Applications, 64, 36, 10.1016/j.eswa.2016.07.017
Altman, 1968, Financial ratios, discriminant analysis and the prediction of corporate bankruptcy, The Journal of Finance, 23, 589, 10.1111/j.1540-6261.1968.tb00843.x
Baesens, 2003, Benchmarking state-of-the-art classification algorithms for credit scoring, Journal of the Operational Research Society, 54, 627, 10.1057/palgrave.jors.2601545
Bauer, 1999, An empirical comparison of voting classification algorithms: Bagging, boosting, and variants, Machine Learning, 36, 105, 10.1023/A:1007515423169
Bergstra, 2012, Random search for hyper-parameter optimization, Journal of Machine Learning Research, 13, 281
Bergstra, 2013, Hyperopt: A Python library for optimizing the hyperparameters of machine learning algorithms, 13, 10.25080/Majora-8b375195-003
Bergstra, 2011, Algorithms for hyper-parameter optimization, 2546
Bishop, 2006
Breiman, 1984
Brillante, 2015, Investigating the use of gradient boosting machine, random forest and their ensemble to predict skin flavonoid content from berry physical–mechanical characteristics in wine grapes, Computers and Electronics in Agriculture, 117, 186, 10.1016/j.compag.2015.07.017
Brown, 2012, An experimental comparison of classification algorithms for imbalanced credit scoring data sets, Expert Systems with Applications, 39, 3446, 10.1016/j.eswa.2011.09.033
Chang, 2011, LIBSVM: A library for support vector machines, ACM Transactions on Intelligent Systems and Technology (TIST), 2, 27
Chen, 2015, Measuring the curse of dimensionality and its effects on particle swarm optimization and differential evolution, Applied Intelligence, 42, 514, 10.1007/s10489-014-0613-2
Chen, 2016, XGBoost: A scalable tree boosting system, arXiv preprint arXiv:1603.02754
Chen, 2015, xgboost: EXtreme Gradient Boosting, R package version 0.4-2
Chen, 2016, Group social capital and lending outcomes in the financial credit market: An empirical study of online peer-to-peer lending, Electronic Commerce Research and Applications, 15, 1, 10.1016/j.elerap.2015.11.003
Demšar, 2006, Statistical comparisons of classifiers over multiple data sets, Journal of Machine Learning Research, 7, 1
Dietterich, 2000, An experimental comparison of three methods for constructing ensembles of decision trees: Bagging, boosting, and randomization, Machine Learning, 40, 139, 10.1023/A:1007607513941
Duin, 2000, Experiments with classifier combining rules, 16
Elith, 2008, A working guide to boosted regression trees, Journal of Animal Ecology, 77, 802, 10.1111/j.1365-2656.2008.01390.x
Finlay, 2011, Multiple classifier architectures and their application to credit risk assessment, European Journal of Operational Research, 210, 368, 10.1016/j.ejor.2010.09.029
Florez-Lopez, 2015, Enhancing accuracy and interpretability of ensemble strategies in credit risk assessment. A correlated-adjusted decision forest proposal, Expert Systems with Applications, 42, 5737, 10.1016/j.eswa.2015.02.042
Freund, 1995, A desicion-theoretic generalization of on-line learning and an application to boosting, 23
Friedman, 2001, Greedy function approximation: A gradient boosting machine, Annals of Statistics, 29, 1189, 10.1214/aos/1013203451
Friedman, 2002, Stochastic gradient boosting, Computational Statistics & Data Analysis, 38, 367, 10.1016/S0167-9473(01)00065-2
Guelman, 2012, Gradient boosting trees for auto insurance loss cost modeling and prediction, Expert Systems with Applications, 39, 3659, 10.1016/j.eswa.2011.09.058
Guo, 2016, Instance-based credit risk assessment for investment decisions in P2P lending, European Journal of Operational Research, 249, 417, 10.1016/j.ejor.2015.05.050
Hand, 2006, Classifier technology and the illusion of progress, Statistical Science, 21, 1, 10.1214/088342306000000060
Hand, 2009, Measuring classifier performance: A coherent alternative to the area under the ROC curve, Machine Learning, 77, 103, 10.1007/s10994-009-5119-5
Hand, 1997, Statistical classification methods in consumer credit scoring: A review, Journal of the Royal Statistical Society: Series A (Statistics in Society), 160, 523, 10.1111/j.1467-985X.1997.00078.x
Harris, 2015, Credit scoring using the clustered support vector machine, Expert Systems with Applications, 42, 741, 10.1016/j.eswa.2014.08.029
Hastie, 2009, Boosting and additive trees, 337
Huang, 2007, Credit scoring with a data mining approach based on support vector machines, Expert Systems with Applications, 33, 847, 10.1016/j.eswa.2006.07.007
Hutter, 2011, Sequential model-based optimization for general algorithm configuration, 507
Johnson, 2014, Learning nonlinear functions using regularized greedy forest, IEEE Transactions on Pattern Analysis and Machine Intelligence, 36, 942, 10.1109/TPAMI.2013.159
Lee, 2006, Mining the customer credit using classification and regression tree and multivariate adaptive regression splines, Computational Statistics & Data Analysis, 50, 1113, 10.1016/j.csda.2004.11.006
Lessmann, 2015, Benchmarking state-of-the-art classification algorithms for credit scoring: An update of research, European Journal of Operational Research, 247, 124, 10.1016/j.ejor.2015.05.030
Lin, 2008, Particle swarm optimization for parameter determination and feature selection of support vector machines, Expert Systems with Applications, 35, 1817, 10.1016/j.eswa.2007.08.088
Min, 2005, Bankruptcy prediction using support vector machine with optimal choice of kernel function parameters, Expert Systems with Applications, 28, 603, 10.1016/j.eswa.2004.12.008
Nanni, 2009, An experimental comparison of ensemble of classifiers for bankruptcy prediction and credit scoring, Expert Systems with Applications, 36, 3028, 10.1016/j.eswa.2008.01.018
Nascimento, 2014, Integrating complementary techniques for promoting diversity in classifier ensembles: A systematic study, Neurocomputing, 138, 347, 10.1016/j.neucom.2014.01.027
Nie, 2011, Credit card churn forecasting by logistic regression and decision tree, Expert Systems with Applications, 38, 15273, 10.1016/j.eswa.2011.06.028
Pötzsch, 2010, The role of soft information in trust building: Evidence from online social lending, 381
Paleologo, 2010, Subagging for credit scoring models, European Journal of Operational Research, 201, 490, 10.1016/j.ejor.2009.03.008
Pedregosa, 2011, Scikit-learn: Machine learning in Python, Journal of Machine Learning Research, 12, 2825
Serrano-Cinca, 2016, The use of profit scoring as an alternative to credit scoring systems in peer-to-peer (P2P) lending, Decision Support Systems, 89, 113, 10.1016/j.dss.2016.06.014
Simon, 2008
Snoek, 2012, Practical Bayesian optimization of machine learning algorithms, 2951
Sutton, 2005, Classification and regression trees, bagging, and boosting, Handbook of Statistics, 24, 303, 10.1016/S0169-7161(04)24011-1
Thornton, 2013, Auto-WEKA: Combined selection and hyperparameter optimization of classification algorithms, 847
Tsai, 2014, A comparative study of classifier ensembles for bankruptcy prediction, Applied Soft Computing, 24, 977, 10.1016/j.asoc.2014.08.047
Twala, 2010, Multiple classifier application to credit risk assessment, Expert Systems with Applications, 37, 3326, 10.1016/j.eswa.2009.10.018
Wang, 2011, A comparative assessment of ensemble learning for credit scoring, Expert Systems with Applications, 38, 223, 10.1016/j.eswa.2010.06.048
Wang, 2012, Two credit scoring models based on dual strategy ensemble trees, Knowledge-Based Systems, 26, 61, 10.1016/j.knosys.2011.06.020
West, 2000, Neural network credit scoring models, Computers & Operations Research, 27, 1131, 10.1016/S0305-0548(99)00149-5
Wiginton, 1980, A note on the comparison of logit and discriminant models of consumer credit behavior, Journal of Financial and Quantitative Analysis, 15, 757, 10.2307/2330408
Wolpert, 1997, No free lunch theorems for optimization, IEEE Transactions on Evolutionary Computation, 1, 67, 10.1109/4235.585893
Wu, 2012, Credit risk assessment and decision making by a fusion approach, Knowledge-Based Systems, 35, 102, 10.1016/j.knosys.2012.04.025
Yeh, 2012, A hybrid KMV model, random forests and rough set theory approach for credit rating, Knowledge-Based Systems, 33, 166, 10.1016/j.knosys.2012.04.004
Zhang, 2015, A gradient boosting method to improve travel time prediction, Transportation Research Part C: Emerging Technologies, 58, 308, 10.1016/j.trc.2015.02.019
Zhang, 2016, Research on Credit Scoring by Fusing Social Media Information in Online Peer-to-Peer Lending, Procedia Computer Science, 91, 168, 10.1016/j.procs.2016.07.055