Predicting direct marketing response in banking: comparison of class imbalance methods

Springer Science and Business Media LLC - Volume 11, Issue 4 - Pages 831-849 - 2017
Vera L. Miguéis¹, Ana S. Camanho¹, José Borges¹
¹Faculdade de Engenharia, Universidade do Porto, Porto, Portugal

Abstract

Keywords


References

Abroud A, Choong YV, Muthaiyah S, Fie DYG (2015) Adopting e-finance: decomposing the technology acceptance model for investors. Serv Bus 9(1):161–182

Alpaydin E (2009) Introduction to machine learning, 2nd edn. The MIT Press, Cambridge

American Banker (2012) Customer analytics growing in banks. http://www.americanbanker.com/btn/25_11/customer-analytics-growing-in-banks-1053866-1.html

Amini M, Rezaeenour J, Hadavandi E (2015) A cluster-based data balancing ensemble classifier for response modeling in Bank Direct Marketing. Int J Comput Intell Appl 14(04):1550022. doi: 10.1142/S1469026815500224

Ansari A, Mela CF, Neslin SA (2008) Customer channel migration. J Mark Res 45(1):60–76. doi: 10.1509/jmkr.45.1.60

Ayetiran EF, Adeyemo AB (2012) A data mining-based response model for target selection in direct marketing. IJ Inf Technol Comput Sci 1:9–18

Baesens B, Viaene S, Van den Poel D, Vanthienen J, Dedene G (2002) Bayesian neural network learning for repeat purchase modelling in direct marketing. Eur J Oper Res 138(1):191–211

Ben Ishak A (2016) Variable selection using support vector regression and random forests: a comparative study. Intell Data Anal 20(1):83–104. doi: 10.3233/IDA-150795

Breiman L (1996) Bagging predictors. Mach Learn 24(2):123–140

Burez J, Van den Poel D (2009) Handling class imbalance in customer churn prediction. Expert Syst Appl 36:4626–4636

Burton SH, Morris RG, Giraud-Carrier CG, West JH, Thackeray R (2014) Mining useful association rules from questionnaire data. Intell Data Anal 18(3):479–494. doi: 10.3233/IDA-140652

Chan KY, Loh WY (2004) LOTUS: an algorithm for building accurate and comprehensible logistic regression trees. J Comput Graph Stat 13(4):826–852. doi: 10.1198/106186004X13064

Chawla NV, Bowyer KW, Hall LO, Kegelmeyer WP (2002) SMOTE: synthetic minority over-sampling technique. J Artif Intell Res 16(1):321–357

Chen WC, Hsu CC, Hsu JN (2011) Optimal selection of potential customer range through the union sequential pattern by using a response model. Expert Syst Appl 38(6):7451–7461. doi: 10.1016/j.eswa.2010.12.078

Chen K, Hu YH, Hsieh YC (2014) Predicting customer churn from valuable B2B customers in the logistics industry: a case study. Inf Syst e-Bus Manag 13(3):475–494. doi: 10.1007/s10257-014-0264-1

Chih WH, Liou DK, Hsu LC (2014) From positive and negative cognition perspectives to explore e-shoppers real purchase behavior: an application of tricomponent attitude model. Inf Syst e-Bus Manag 13(3):495–526. doi: 10.1007/s10257-014-0249-0

Cohen MD (2004) Exploiting response models optimizing cross-sell and up-sell opportunities in banking. Inf Syst 29(4):327–341. doi: 10.1016/j.is.2003.08.001

Direct Marketing Association (2012) What is the direct marketing association? http://www.the-dma.org/aboutdma/whatisthedma.shtml

Elsalamony H, Elsayad A (2013) Bank direct marketing based on neural network. Int J Eng Adv Technol 2(6):392–400

Freund Y, Schapire RE (1996) Experiments with a new boosting algorithm. In: Saitta L (ed) Proceedings of the thirteenth international conference on machine learning (ICML 1996), Morgan Kaufmann, pp 148–156

Friedman JH (2001) Greedy function approximation: a gradient boosting machine. Ann Stat 29(5):1189–1232. doi: 10.1214/aos/1013203451

García S, Herrera F (2009) Evolutionary undersampling for classification with imbalanced datasets: proposals and taxonomy. Evol Comput 17(3):275–306. doi: 10.1162/evco.2009.17.3.275

García-Pedrajas N, Ortiz-Boyer D, García-Pedrajas MD, Fyfe C (2012) Class imbalance methods for translation initiation site recognition. In: García-Pedrajas N, Herrera F, Fyfe C, Benítez JM, Ali M (eds) Trends in applied intelligent systems, no. 6096 in lecture notes in computer science. Springer, Berlin, pp 327–336

Govindarajan M (2015) Comparative study of ensemble classifiers for direct marketing. Int Dec Tech 9(2):141–152. doi: 10.3233/IDT-140212

Gür Ali Ö, Aritürk U (2014) Dynamic churn prediction framework with more effective use of rare event data: the case of private banking. Expert Syst Appl 41(17):7889–7903. doi: 10.1016/j.eswa.2014.06.018

Gázquez-Abad JC, Cannière MHD, Martínez-López FJ (2011) Dynamics of customer response to promotional and relational direct mailings from an apparel retailer: The moderating role of relationship strength. J Retail 87(2):166–181. doi: 10.1016/j.jretai.2011.03.001

Ha K, Cho S, MacLachlan D (2005) Response models based on bagging neural networks. J Interact Market 19(1):17–30. doi: 10.1002/dir.20028

Han J, Kamber M (2006) Data mining: concepts and techniques, 2nd edn. Morgan Kaufmann, Amsterdam

He H (2011) Self-adaptive systems for machine intelligence. John Wiley & Sons, New Jersey

Hosseini SY, Bideh AZ (2014) A data mining approach for segmentation-based importance-performance analysis (SOM-BPNN-IPA): a new framework for developing customer retention strategies. Serv Bus 8(2):295–312. doi: 10.1007/s11628-013-0197-7

Hsieh NC (2004) An integrated data mining and behavioral scoring model for analyzing bank customers. Expert Syst Appl 27(4):623–633. doi: 10.1016/j.eswa.2004.06.007

Hu X (2005) A data mining approach for retailing bank customer attrition analysis. Appl Intell 22(1):47–60. doi: 10.1023/B:APIN.0000047383.53680.b6

Japkowicz N, Stephen S (2002) The class imbalance problem: a systematic study. Intell Data Anal 6(5):429–449

Jayasree V (2013) A review on data mining in banking sector. Am J Appl Sci 10(10):1160–1165. doi: 10.3844/ajassp.2013.1160.1165

Jingbiao R, Shaohong Y (2010) Research and improvement of clustering algorithm in data mining. In: 2010 2nd international conference on signal processing systems (ICSPS), vol 1, pp 842–845. doi: 10.1109/ICSPS.2010.5555239

Khajvand M, Tarokh MJ (2011) Estimating customer future value of different customer segments based on adapted RFM model in retail banking context. Procedia Comput Sci 3:1327–1332. doi: 10.1016/j.procs.2011.01.011

Kim G, Chae BK, Olson DL (2013) A support vector machine (SVM) approach to imbalanced datasets of customer responses: comparison with other customer response models. Serv Bus 7(1):167–182. doi: 10.1007/s11628-012-0147-9

Lessmann S, Baesens B, Mues C, Pietsch S (2008) Benchmarking classification models for software defect prediction: a proposed framework and novel findings. IEEE Trans Softw Eng 34(4):485–496. doi: 10.1109/TSE.2008.35

Li W, Wu X, Sun Y, Zhang Q (2010) Credit card customer segmentation and target marketing based on data mining. In: 2010 international conference on computational intelligence and security (CIS), pp 73–76. doi: 10.1109/CIS.2010.23

Liao SH, Chen CM, Hsieh CL, Hsiao SC (2009) Mining information users’ knowledge for one-to-one marketing on information appliance. Expert Syst Appl 36(3):4967–4979. doi: 10.1016/j.eswa.2008.06.020

Liébana-Cabanillas F, Nogueras R, Herrera LJ, Guillén A (2013) Analysing user trust in electronic banking using data mining methods. Expert Syst Appl 40(14):5439–5447. doi: 10.1016/j.eswa.2013.03.010

Ling CX, Li C (1998) Data mining for direct marketing: Problems and solutions. In: Knowledge discovery and data mining, pp 217–225

Liu XY, Wu J, Zhou ZH (2009) Exploratory undersampling for class-imbalance learning. IEEE Trans Syst Man Cybern Part B 39(2):539–550. doi: 10.1109/TSMCB.2008.2007853

Lu MT, Tzeng GH, Cheng H, Hsu CC (2015) Exploring mobile banking services for user behavior in intention adoption: using new hybrid MADM model. Serv Bus 9(3):541–565. doi: 10.1007/s11628-014-0239-9

McCulloch W, Pitts W (1943) A logical calculus of the ideas immanent in nervous activity. Bull Math Biophys 5(4):115–133

Migueis VL, Benoit DF, Van den Poel D (2013) Enhanced decision support in credit scoring using Bayesian binary quantile regression. J Oper Res Soc 64(9):1374–1383. doi: 10.1057/jors.2012.116

Moro S, Cortez P, Rita P (2014) A data-driven approach to predict the success of bank telemarketing. Decis Support Syst 62:22–31. doi: 10.1016/j.dss.2014.03.001

Ngai E, Xiu L, Chau D (2009) Application of data mining techniques in customer relationship management: a literature review and classification. Expert Syst Appl 36(2, Part 2):2592–2602

Nie G, Rowe W, Zhang L, Tian Y, Shi Y (2011) Credit card churn forecasting by logistic regression and decision tree. Expert Syst Appl 38(12):15273–15285

Olson DL, Chae B (2012) Direct marketing decision support through predictive customer response modeling. Decis Support Syst 54(1):443–451. doi: 10.1016/j.dss.2012.06.005

Olson DL, Cao Q, Gu C, Lee D (2009) Comparison of customer response models. Serv Bus 3(2):117–130. doi: 10.1007/s11628-009-0064-8

Quah JTS, Sriganesh M (2008) Real-time credit card fraud detection using computational intelligence. Expert Syst Appl 35(4):1721–1732. doi: 10.1016/j.eswa.2007.08.093

Ras ZW, Wieczorkowska A (2000) Action-rules: how to increase profit of a company. In: Zighed DA, Komorowski J, Zytkow J (eds) Principles of data mining and knowledge discovery, no. 1910 in lecture notes in computer science. Springer, Berlin, pp 587–592

Ratner B (2004) Statistical modeling and analysis for database marketing: effective techniques for mining big data. CRC Press, Boca Raton

Schwartz B, Lauridsen JT (2007) Scoring of bank customers for a life insurance campaign. Technical Report 5/2007, University of Southern Denmark, Denmark

Seret A, Bejinaru A, Baesens B (2015) Domain knowledge based segmentation of online banking customers. Intell Data Anal 19:163–184. doi: 10.3233/IDA-150776

Srinivas K, Rao GR, Govardhan A (2014) Adapting rough-fuzzy classifier to solve class imbalance problem in heart disease prediction using FCM. Int J Med Eng Inform 6(4):297–318. doi: 10.1504/IJMEI.2014.065427

Sun B, Li S, Zhou C (2006) “Adaptive” learning and “proactive” customer relationship management. J Interact Market 20(3–4):82–96. doi: 10.1002/dir.20069

Sun Y, Kamel MS, Wong AKC, Wang Y (2007) Cost-sensitive boosting for classification of imbalanced data. Pattern Recognit 40(12):3358–3378. doi: 10.1016/j.patcog.2007.04.009

Vapnik VN (1995) The nature of statistical learning theory. Springer, New York

Verhoef PC, Spring PN, Hoekstra JC, Leeflang PS (2003) The commercial use of segmentation and predictive modeling techniques for database marketing in the Netherlands. Decis Support Syst 34(4):471–481

Vriens M, Van der Scheer HR, Hoekstra JC, Bult JR (1998) Conjoint experiments for direct mail response optimization. Eur J Market 32(3/4):323–339. doi: 10.1108/03090569810204625

Wang YY, Luse A, Townsend AM, Mennecke BE (2014) Understanding the moderating roles of types of recommender systems and products on customer behavioral intention to use recommender systems. Inf Syst e-Bus Manag 13(4):769–799. doi: 10.1007/s10257-014-0269-9

Xiong T, Wang S, Mayers A, Monga E (2013) Personal bankruptcy prediction by mining credit card data. Expert Syst Appl 40(2):665–676. doi: 10.1016/j.eswa.2012.07.072

Yeh IC, Lien CH (2009) The comparisons of data mining techniques for the predictive accuracy of probability of default of credit card clients. Expert Syst Appl 36(2, Part 1):2473–2480. doi: 10.1016/j.eswa.2007.12.020

Yen SJ, Lee YS (2009) Cluster-based under-sampling approaches for imbalanced data distributions. Expert Syst Appl 36(3):5718–5727. doi: 10.1016/j.eswa.2008.06.108

Zarnani A, Rahgozar M, Lucas C, Taghiyareh F (2009) Effective spatial clustering methods for optimal facility establishment. Intell Data Anal 13(1):61–84