A framework to improve churn prediction performance in retail banking

Financial Innovation - Volume 10 - Pages 1-29 - 2024
João B. G. Brito [1], Guilherme B. Bucco [2], Rodrigo Heldt [2], João L. Becker [3], Cleo S. Silveira [2], Fernando B. Luce [2], Michel J. Anzanello [1]
[1] Departamento de Engenharia de Produção, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
[2] Escola de Administração, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
[3] Escola de Administração de Empresas de São Paulo, Fundação Getulio Vargas, São Paulo, Brazil

Abstract

Managing customer retention is critical to a company’s profitability and firm value. However, predicting customer churn is challenging. The extant research on the topic mainly focuses on the type of model developed to predict churn, devoting little or no effort to data preparation methods. These methods directly affect how well patterns can be identified and can therefore increase a model’s predictive performance. We addressed this problem by (1) employing feature engineering methods to generate a set of potential predictor features suitable for the banking industry and (2) preprocessing the majority and minority classes so that the classification model learns their patterns more effectively. The framework encompasses state-of-the-art data preprocessing methods: (1) feature engineering with recency, frequency, and monetary value concepts to address the imbalanced dataset issue, (2) oversampling using the adaptive synthetic sampling (ADASYN) algorithm, and (3) undersampling using the NearMiss algorithm. After data preprocessing, we use XGBoost and elastic net methods for churn prediction. We validated the proposed framework with a dataset of more than 3 million customers and about 170 million transactions. The framework outperformed alternative methods reported in the literature in terms of area under the precision-recall curve, accuracy, recall, and specificity. From a practical perspective, the framework provides managers with valuable information to predict customer churn and develop strategies for customer retention in the banking industry.
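To make the feature engineering step more concrete, the following is a minimal sketch of how recency, frequency, and monetary value features might be derived from a transaction log. It is not the authors' implementation: the function name `rfm_features`, the tuple layout of the input, the cutoff parameter `reference_date`, the choice of total spend as the monetary value, and the toy data are all assumptions made for illustration. It uses only the Python standard library.

```python
from collections import defaultdict
from datetime import date


def rfm_features(transactions, reference_date):
    """Compute recency, frequency and monetary value per customer.

    transactions: iterable of (customer_id, transaction_date, amount) tuples
        (a hypothetical layout, not the schema used in the paper).
    reference_date: end of the observation window; later transactions are
        ignored so that no information from after the cutoff leaks in.
    """
    last_seen = {}
    count = defaultdict(int)
    total = defaultdict(float)

    for customer_id, tx_date, amount in transactions:
        if tx_date > reference_date:
            continue  # outside the observation window
        count[customer_id] += 1
        total[customer_id] += amount
        if customer_id not in last_seen or tx_date > last_seen[customer_id]:
            last_seen[customer_id] = tx_date

    return {
        cid: {
            # Recency: days since the customer's most recent transaction.
            "recency_days": (reference_date - last_seen[cid]).days,
            # Frequency: number of transactions in the window.
            "frequency": count[cid],
            # Monetary value: total amount in the window (assumption;
            # an average per transaction is another common choice).
            "monetary": total[cid],
        }
        for cid in last_seen
    }


# Hypothetical toy data, not taken from the paper's dataset.
transactions = [
    ("c1", date(2023, 1, 5), 120.0),
    ("c1", date(2023, 3, 20), 80.0),
    ("c2", date(2023, 2, 10), 500.0),
    ("c2", date(2023, 4, 15), 40.0),  # after the cutoff, ignored
]
features = rfm_features(transactions, reference_date=date(2023, 3, 31))
```

In this sketch each customer ends up with one row of three numeric predictors. In a pipeline like the one the abstract describes, such features would then be passed to the resampling steps (ADASYN and NearMiss) before a classifier such as XGBoost or elastic net is fitted.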
