LSTM Response Models for Direct Marketing Analytics: Replacing Feature Engineering with Deep Learning

Journal of Interactive Marketing - Volume 53 - Pages 80-95 - 2021
Mainak Sarkar¹, Arnaud De Bruyn¹
¹ ESSEC Business School, Cergy, France

References

Agrawal, 1996, Market share forecasting: An empirical comparison of artificial neural networks and multinomial logit model, Journal of Retailing, 72, 383, 10.1016/S0022-4359(96)90020-2
Ascarza, 2016, The perils of proactive churn prevention using plan recommendations: Evidence from a field experiment, Journal of Marketing Research, 53, 46, 10.1509/jmr.13.0483
Bahdanau, 2014, Neural machine translation by jointly learning to align and translate, arXiv preprint
Basu, 1995, Modeling the response pattern to direct marketing campaigns, Journal of Marketing Research, 32, 204, 10.1177/002224379503200207
Bengio, 2012, Practical recommendations for gradient-based training of deep architectures, 437
Bengio, 1994, Learning long-term dependencies with gradient descent is difficult, IEEE Transactions on Neural Networks, 5, 157, 10.1109/72.279181
Ben
Bitran, 1996, Mailing decisions in the catalog sales industry, Management Science, 42, 1364, 10.1287/mnsc.42.9.1364
Bjorck, 2018, Understanding batch normalization
Blattberg, 2008
Breiman, 2001, Random forests, Machine Learning, 45, 5, 10.1023/A:1010933404324
Brown, 2001, Products of hidden Markov models
Bult, 1995, Optimal selection for direct mail, Marketing Science, 14, 378, 10.1287/mksc.14.4.378
Cho, 2014, Learning phrase representations using RNN encoder-decoder for statistical machine translation, arXiv preprint
Chung, 2014, Empirical evaluation of gated recurrent neural networks on sequence modeling, arXiv preprint
Colombo, 1999, A stochastic RFM model, Journal of Interactive Marketing, 13, 2, 10.1002/(SICI)1520-6653(199922)13:3<2::AID-DIR1>3.0.CO;2-H
Coussement, 2013, Customer churn prediction in the online gambling industry: The beneficial effect of ensemble learning, Journal of Business Research, 66, 1629, 10.1016/j.jbusres.2012.12.008
De Bruyn, 2020, Artificial intelligence and marketing: Pitfalls and opportunities, Journal of Interactive Marketing, 10.1016/j.intmar.2020.04.007
Dong, 2018
Donkers, 2006, Deriving target selection rules from endogenously selected samples, Journal of Applied Econometrics, 21, 549, 10.1002/jae.858
Elsner, 2004, Optimizing Rhenania's direct marketing business through dynamic multilevel modeling (DMLM) in a multicatalog-brand environment, Marketing Science, 192, 10.1287/mksc.1040.0063
Fader, 2005, RFM and CLV: Using iso-value curves for customer base analysis, Journal of Marketing Research, 42, 415, 10.1509/jmkr.2005.42.4.415
Friedman, 2009, glmnet: Lasso and elastic-net regularized generalized linear models
Gal, 2016, Dropout as a Bayesian approximation: Representing model uncertainty in deep learning, 1050
George, 2013, Maximizing profits for a multi-category catalog retailer, Journal of Retailing, 89, 374, 10.1016/j.jretai.2013.05.001
Gers, 1999
Gönül, 1998, Optimal mailing of catalogs: A new methodology using estimable structural dynamic programming models, Management Science, 44, 1249, 10.1287/mnsc.44.9.1249
Gönül, 2006, How to compute optimal catalog mailing decisions, Marketing Science, 25, 65, 10.1287/mksc.1050.0136
Gönül, 2000, Mailing smarter to catalog customers, Journal of Interactive Marketing, 14, 2, 10.1002/(SICI)1520-6653(200021)14:2<2::AID-DIR1>3.0.CO;2-N
Goodfellow, 2016
Graves, 2009, Offline handwriting recognition with multidimensional recurrent neural networks, Advances in Neural Information Processing Systems
Graves, 2014, Neural Turing machines, arXiv preprint
Greff, 2016, LSTM: A search space odyssey, IEEE Transactions on Neural Networks and Learning Systems, 28, 2222, 10.1109/TNNLS.2016.2582924
Guadagni, 1983, A logit model of brand choice calibrated on scanner data, Marketing Science, 2, 203, 10.1287/mksc.2.3.203
Hinton
Hochreiter, 1997, Long short-term memory, Neural Computation, 9, 1735, 10.1162/neco.1997.9.8.1735
Kingma, 2014, Adam: A method for stochastic optimization, arXiv preprint
Kuhn, 2013
Kuhn, 2019
Lemmens, 2006, Bagging and boosting classification trees to predict churn, Journal of Marketing Research, 43, 276, 10.1509/jmkr.43.2.276
Lilien, 2013
Ling, 1998, Data mining for direct marketing: Problems and solutions, 98, 73
Luong, 2015, Effective approaches to attention-based neural machine translation, arXiv preprint
Malthouse, 1999, Ridge regression and direct marketing scoring models, Journal of Interactive Marketing, 13, 10, 10.1002/(SICI)1520-6653(199923)13:4<10::AID-DIR2>3.0.CO;2-3
Martínez, 2020, A machine learning framework for customer purchase prediction in the non-contractual setting, European Journal of Operational Research, 281, 588, 10.1016/j.ejor.2018.04.034
Michelucci, 2018
Ming, 2017, Understanding hidden memories of recurrent neural networks, 13
Moe, 2004, Capturing evolving visit behavior in clickstream data, Journal of Interactive Marketing, 18, 5, 10.1002/dir.10074
Moe, 2004, Dynamic conversion behavior at e-commerce sites, Management Science, 50, 326, 10.1287/mnsc.1040.0153
Neslin, 2006, Defection detection: Measuring and understanding the predictive accuracy of customer churn models, Journal of Marketing Research, 43, 204, 10.1509/jmkr.43.2.204
Netzer, 2008, A hidden Markov model of customer relationship dynamics, Marketing Science, 27, 185, 10.1287/mksc.1070.0294
Olah, 2015
Oshiro, 2012, How many trees in a random forest?, 154
Park, 2004, Modeling browsing behavior at multiple websites, Marketing Science, 23, 280, 10.1287/mksc.1040.0050
Pascanu, 2013, On the difficulty of training recurrent neural networks, 28, 1310
Pointer, 2019
Roberts, 1999
Rudin, 2019, Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead, Nature Machine Intelligence, 1, 206, 10.1038/s42256-019-0048-x
Rumelhart, 1986, Learning representations by back-propagating errors, Nature, 323, 533, 10.1038/323533a0
Saleh, 2018
Schweidel, 2013, Incorporating direct marketing activity into latent attrition models, Marketing Science, 32, 471, 10.1287/mksc.2013.0781
Shahriari, 2015, Taking the human out of the loop: A review of Bayesian optimization, Proceedings of the IEEE, 104, 148, 10.1109/JPROC.2015.2494218
Siami-Namini, 2019, The performance of LSTM and BiLSTM in forecasting time series, 3285
Simester, 2006, Dynamic catalog mailing policies, Management Science, 52, 683, 10.1287/mnsc.1050.0504
Srivastava, 2014, Dropout: A simple way to prevent neural networks from overfitting, The Journal of Machine Learning Research, 15, 1929
Tieleman, 2012, Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude, COURSERA: Neural Networks for Machine Learning, 4, 26
Van Diepen, 2009, Dynamic and competitive effects of direct mailings: A charitable giving application, Journal of Marketing Research, 46, 120, 10.1509/jmkr.46.1.120
Vaswani, 2017, Attention is all you need, 5998
Zheng, 2018
Zou, 2005, Regularization and variable selection via the elastic net, Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67, 301, 10.1111/j.1467-9868.2005.00503.x