A Bolasso based consistent feature selection enabled random forest classification algorithm: An application to credit risk assessment

Applied Soft Computing - Volume 86 - Page 105936 - 2020

Nisha Arora¹, Pankaj Deep Kaur¹

¹Department of Computer Science and Engineering, Guru Nanak Dev University, Regional Campus, Jalandhar, India

Abstract

Keywords

References

https://www.capitaline.com.

https://www.federalreserve.gov/releases/chargeoff/delallsa.htm.

Oreski, 2013, Genetic algorithm-based heuristic for feature selection in credit risk assessment, Expert Syst. Appl.

Dahiya, 2017, A feature selection enabled hybrid-bagging algorithm for credit risk evaluation, Expert Syst., 34, 10.1111/exsy.12217

D. Wang, Z. Zhang, A hybrid system with filter approach and multiple population genetic algorithm for feature selection in credit scoring, J. Comput. Appl. Math. http://dx.doi.org/10.1016/j.cam.2017.04.036.

Chandrashekar, 2014

Dehuri, 2013, Revisiting evolutionary algorithms in feature selection and nonfuzzy/fuzzy rule based classification, WIREs Data Min. Knowl. Discov., 3, 83, 10.1002/widm.1087

Cai, 2018, Feature selection in machine learning: A new perspective, Neurocomputing, 10.1016/j.neucom.2017.11.077

Yue Zhang, Weihong Guo, Soumya Ray, On the consistency of Feature Selection with Lasso for Non-Linear Targets, in: Proceedings of The 33rd International Conference on Machine Learning, PMLR, vol. 48, 2016, pp. 183–191.

Huang, 2018, Enterprise credit risk evaluation based on neural network algorithm, Cogn. Syst. Res., 10.1016/j.cogsys.2018.07.023

Pandey, 2017, Credit risk analysis using machine learning classifiers

Lessmann, 2015, Benchmarking state-of-the-art classification algorithms for credit scoring: An update of research, European J. Oper. Res., 247, 124, 10.1016/j.ejor.2015.05.030

Lin, 2012, Machine learning in financial crisis prediction: A survey, IEEE Trans. Syst. Man Cybern. C, 42, 421, 10.1109/TSMCC.2011.2170420

Kruppa, 2013, Consumer credit risk: Individual probability estimates using machine learning, Expert Syst. Appl., 40, 5125, 10.1016/j.eswa.2013.03.019

Malekipirbazari, 2015, Risk assessment in social lending via random forests, Expert Syst. Appl., 42, 4621, 10.1016/j.eswa.2015.02.001

Shi, 2011, Credit assessment with random forests, 24

Behr, 2016, Default patterns in seven EU countries: A random forest approach, Int. J. Econ. Bus., 24, 181, 10.1080/13571516.2016.1252532

Bingamawa, 2016

Antonakis, 2009, Assessing Naïve Bayes as a method for screening credit applicants, J. Appl. Stat., 5, 537, 10.1080/02664760802554263

Yeh, 2009, The comparisons of data mining techniques for the predictive accuracy of probability of default of credit card clients, Expert Syst. Appl., 36, 2473, 10.1016/j.eswa.2007.12.020

Danenas, 2011, Credit risk evaluation model development using support vector based classifiers, Procedia Comput. Sci., 4, 1699, 10.1016/j.procs.2011.04.184

Danenas, 2015, Selection of support vector machines based classifiers for credit risk domain, Expert Syst. Appl., 42, 3194, 10.1016/j.eswa.2014.12.001

Sivasankar, 2017, A study of dimensionality reduction techniques with machine learning methods for credit risk prediction, vol. 556

Jiang, 2018, Stationary Mahalanobis kernel SVM for credit risk evaluation, Appl. Soft Comput., 71, 10.1016/j.asoc.2018.07.005

Henley, 1997, Construction of a k-nearest-neighbour credit-scoring system, IMA J. Manag. Math., 8, 305, 10.1093/imaman/8.4.305

Baesens, 2003, Benchmarking state-of-the-art classification algorithms for credit scoring, J. Oper. Res. Soc., 54, 627, 10.1057/palgrave.jors.2601545

Li, 2009, The hybrid credit scoring model based on KNN classifier, 330

Hand, 2003, Choosing k for two-class nearest neighbor classifiers with unbalanced classes, Pattern Recognit. Lett., 24, 1555, 10.1016/S0167-8655(02)00394-X

Islam, 2007, Investigating the performance of naive-Bayes classifiers and k-nearest neighbor classifiers

Liu, 2005, Data mining feature selection for credit scoring models, J. Oper. Res. Soc., 56, 1099, 10.1057/palgrave.jors.2601976

Jadhav, 2018, Information gain directed genetic algorithm wrapper feature selection for credit rating, Appl. Soft Comput., 69, 541, 10.1016/j.asoc.2018.04.033

X. Zhang, Y. Yang, Z. Zhou, A novel credit scoring model based on optimized random forest, in: 2018 IEEE 8th Annual Computing and Communication Workshop and Conference, CCWC, Las Vegas, NV, 2018, pp. 60–65. http://dx.doi.org/10.1109/CCWC.2018.8301707.

Xiao-Ying Liu, Yong Liang, et al., A hybrid genetic algorithm with wrapper-embedded approaches for feature selection, IEEE Access. http://dx.doi.org/10.1109/ACCESS.2018.2818682.

Khaire, 2019, Stability of feature selection algorithm: A review, J. King Saud Univ. - Comput. Inf. Sci.

Somol, 2010, Evaluating stability and comparing output of feature selectors that optimize feature subset cardinality, IEEE Trans. Pattern Anal. Mach. Intell., 32, 1921, 10.1109/TPAMI.2010.34

L.I. Kuncheva, A stability index for feature selection, in: Proc. 25th IASTED Int’l Multi-Conf. Artificial Intelligence and Applications, 2007, pp. 421–427.

I. Kamkar, S.K. Gupta, D. Phung, S. Venkatesh, Exploiting feature relationships towards stable feature selection, in: 2015 IEEE International Conference on Data Science and Advanced Analytics (DSAA), Paris, 2015, pp. 1–10. http://dx.doi.org/10.1109/DSAA.2015.7344859.

Pes, 2019, Ensemble feature selection for high-dimensional data: A stability analysis across multiple domains, Neural Comput. Appl., 10.1007/s00521-019-04082-3

Turney, 1995, Technical note: Bias and the quantification of stability, Mach. Learn., 20, 23, 10.1007/BF00993473

Z. He, W. Yu, Stable feature selection for biomarker discovery, in: Computational Biology and Chemistry, Elsevier. http://dx.doi.org/10.1016/j.compbiolchem.2010.07.002.

Abeel, 2010, Robust biomarker identification for cancer diagnosis with ensemble feature selection methods, Bioinformatics, 26, 392, 10.1093/bioinformatics/btp630

Kamkar, 2015, 9457

Li, 2015, FREL: A stable feature selection algorithm, IEEE Trans. Neural Netw. Learn. Syst., 26, 1388, 10.1109/TNNLS.2014.2341627

Han, 2010, A variance reduction framework for stable feature selection

Tibshirani, 1996, Regression shrinkage and selection via the lasso, J. R. Stat. Soc. Ser. B, 58, 267, 10.1111/j.2517-6161.1996.tb02080.x

Lin, 2014, A new idea of study on the influence factors of companies’ debt costs in the big data era, Procedia Comput. Sci., 31, 532, 10.1016/j.procs.2014.05.299

Fang, 2014, Individual credit risk prediction model: Application of lasso-logistic model, J. Quant. Tech. Econ.

Hongmei Chen, Yaoxin Xiang, The study of credit scoring model based on group lasso, in: 5th International Conference on Information Technology and Quantitative Management, ITQM, Procedia, 2017.

Kamkar, 2015, Stable feature selection for clinical prediction: Exploiting ICD tree structure using tree-lasso, J. Biomed. Inform., 10.1016/j.jbi.2014.11.013

Zhang, 2016, High-order covariate interacted lasso for feature selection, Pattern Recognit. Lett.

Bach, 2008

Liu, 1995, Chi2: Feature selection and discretization of numeric attributes, 388

Trabelsia, 2017, A new feature selection method for nominal classifier based on formal concept analysis, Procedia Comput. Sci., 6

McHugh, 2013, The chi-square test of independence, Biochemia Medica, 143

H. Dağ, K.E. Sayin, I. Yenidoğan, S. Albayrak, C. Acar, Comparison of feature selection algorithms for medical data, in: 2012 International Symposium on Innovations in Intelligent Systems and Applications, Trabzon, 2012, pp. 1–5. http://dx.doi.org/10.1109/INISTA.2012.6247011.

Robnik-Šikonja, 2003, Mach. Learn., 53, 23, 10.1023/A:1025667309714

Cortes, 1995, Mach. Learn., 20, 273

Carrizosa, 2013, Supervised classification and mathematical optimization, Comput. Oper. Res., 40, 150, 10.1016/j.cor.2012.05.015

G.H. John, P. Langley, Estimating continuous distributions in Bayesian classifiers, in: Proceedings of the 11th Conference on Uncertainty in Artificial Intelligence, 1995, pp. 338–345.

Zareapoor Masoumeh, Pourya Shamsolmoali, Application of credit card fraud detection: Based on bagging ensemble classifier, in: International Conference on Computer, Communication and Convergence, ICCC 2015, Procedia Computer Science, http://dx.doi.org/10.1016/j.procs.2015.04.201. http://www.sciencedirect.com/science/article/pii/S1877050915007103.

Mase, 2008, Credit-rating of companies

Breiman, 2001, Random forests, Mach. Learn., 45, 5, 10.1023/A:1010933404324

https://www.lendingclub.com/info/download-data.action.

https://www.kaggle.com/zaurbegiev/my-dataset.

https://archive.ics.uci.edu/ml/datasets/statlog+(german+credit+data).

Core Team, 2018
