Stacking-based ensemble learning of decision trees for interpretable prostate cancer detection

Applied Soft Computing - Tập 77 - Trang 188-204 - 2019
Yuyan Wang1, Dujuan Wang2, Na Geng3, Yanzhang Wang1, Yunqiang Yin4, Yaochu Jin1,5
1School of Management Science and Engineering, Dalian University of Technology, Dalian 116023, China
2Business School, Sichuan University, Chengdu 610064, China
3Department of Industrial Engineering & Logistics Management, Shanghai Jiao Tong University, Shanghai 200240, China
4School of Management and Economics, University of Electronic Science and Technology of China, Chengdu 611731, China
5Joint Laboratory for Artificial Intelligence for Precision Medicine, Jiaxing ACCB Diagnostics Ltd, Jiaxing, Zhejiang 314006, China

Tài liệu tham khảo

Torre, 2015, Global cancer statistics, 2012, Ca A Cancer J. Clin., 65, 87, 10.3322/caac.21262 Reda, 2017, A comprehensive non-invasive framework for diagnosing prostate cancer, Comput. Biol. Med., 81, 148, 10.1016/j.compbiomed.2016.12.010 Welch, 2009, Prostate cancer diagnosis and treatment after the introduction of prostate-specific antigen screening: 1986–2005, J. Natl. Cancer Inst., 101, 1325, 10.1093/jnci/djp278 Finne, 2004, Algorithms based on prostate-specific antigen (psa), free psa, digital rectal examination and prostate volume reduce false-postitive psa results in prostate cancer screening, Int. J. Cancer, 111, 10.1002/ijc.20250 Bermejo, 2015, Development of interpretable predictive models for BPH and prostate cancer, Clin. Med. Insights Oncol., 9, 15, 10.4137/CMO.S19739 Kuncheva, 2001, Decision template for multiple classifier fusion: An experimental comparison, Pattern Recognit., 34, 299, 10.1016/S0031-3203(99)00223-X C. Qian, Y. Yu, Z.H. Zhou, Pareto ensemble pruning, in: Proceedings of AAAI Conference on Artificial Intelligence, 2015, pp. 2935–2944. Thompson, 2005, Operating characteristics of prostate-specific antigen in men with an initial PSA level of 3.0 ng/mL or lower, JAMA, 175, 66, 10.1001/jama.294.1.66 Gosselaar, 2008, The role of the digital rectal examination in subsequent screening visits in the European randomized study of screening for prostate cancer (ERSPC), Rotterdam, Eur. Urol., 54, 581, 10.1016/j.eururo.2008.03.104 Catalona, 1994, Comparison of digital rectal examination and serum prostate specific antigen in the early detection of prostate cancer: results of a multicenter clinical trial of 6, 630 men, J. Urol., 151, 1283, 10.1016/S0022-5347(17)35233-3 Nam, 2007, Assessing individual risk for prostate cancer, J. Clin. Oncol., 25, 3582, 10.1200/JCO.2007.10.6450 Ankerst, 2014, Prostate cancer prevention trial risk calculator 2.0 for the prediction of low- vs high-grade prostate cancer, Urology, 83, 1362, 10.1016/j.urology.2014.02.035 Roobol, 2010, A risk-based strategy improves prostate-specific antigen-driven detection of prostate cancer, Eur. Urol., 57, 79, 10.1016/j.eururo.2009.08.025 Çinar, 2009, Early prostate cancer diagnosis by using artificial neural networks and support vector machines, Expert Syst. Appl., 36, 6357, 10.1016/j.eswa.2008.08.010 Sung, 2011, Prostate cancer detection on dynamic contrast-enhanced MRI: computer-aided diagnosis versus single perfusion parameter maps, Am. J. Roentgenol., 197, 1122, 10.2214/AJR.10.6062 D. Albashish, S. Sahran, A. Abdullah, et al. Multi-scoring feature selection method based on SVM-RFE for prostate cancer diagnosis, in: Proceedings of the IEEE International Conference on Electrical Engineering and Informatics, 2015, pp. 682–686. Xiao, 2017, Prostate cancer prediction using the random forest algorithm that takes into account transrectal ultrasound findings, age, and serum levels of prostate-specific antigen, Asian J. Androl., 19, 586, 10.4103/1008-682X.186884 Breiman, 1996, Bagging predictors, Mach. Learn., 24, 123, 10.1007/BF00058655 Breiman, 2001, Random forests, Mach. Learn., 45, 5, 10.1023/A:1010933404324 Ho, 1998, The random subspace method for constructing decision forests, IEEE Trans. Pattern Anal. Mach. Intell., 20, 832, 10.1109/34.709601 Y. Freund, Experiments with a new boosting algorithm, in: Proceedings of the Thirteenth International Conference on International Conference on Machine Learning, 1996, pp. 148–156. Tripoliti, 2012, Automated diagnosis of diseases based on classification: dynamic determination of the number of trees in random forests algorithm, IEEE Trans. Inf. Technol. Biomed., 16, 615, 10.1109/TITB.2011.2175938 Tian, 2010, An ensemble ELM based on modified AdaBoost. RT algorithm for predicting the temperature of molten steel in ladle furnace, IEEE Trans. Autom. Sci. Eng., 7, 73, 10.1109/TASE.2008.2005640 Martínez-Muñoz, 2009, An analysis of ensemble pruning techniques based on ordered aggregation, IEEE Trans. Pattern Anal. Mach. Intell., 31, 245, 10.1109/TPAMI.2008.78 Adnan, 2016, Optimizing the number of trees in a decision forest to discover a subforest with high ensemble accuracy using a genetic algorithm, Knowl-Based Syst., 110, 86, 10.1016/j.knosys.2016.07.016 Niu, 2018, A parallel multi-objective particle swarm optimization for cascade hydropower reservoir operation in southwest China, Appl. Soft Comput., 70, 562, 10.1016/j.asoc.2018.06.011 Li, 2016, Artificial bee colony algorithm with memory, Appl. Soft Comput., 41, 362, 10.1016/j.asoc.2015.12.046 Sun, 2018, An ensemble framework for assessing solutions of interval programming problems, Inform. Sci., 436–437, 146, 10.1016/j.ins.2018.01.006 Han, 2019, Evolutionary multiobjective blocking lot-streaming flow shop scheduling with machine breakdowns, IEEE Trans. Cybern., 49, 184, 10.1109/TCYB.2017.2771213 Deb, 2002, A fast and elitist multiobjective genetic algorithm: NSGA-II, IEEE Trans. Evol. Comput., 6, 182, 10.1109/4235.996017 Feng, 2017, Scheduling of short-term hydrothermal energy system by parallel multi-objective differential evolution, Appl. Soft Comput., 61, 58, 10.1016/j.asoc.2017.07.054 Zhang, 2018, A decomposition-based archiving approach for multi-objective evolutionary optimization, Inform. Sci., 430–431, 397, 10.1016/j.ins.2017.11.052 Gong, 2018, A multi-objective optimization model and its evolution-based solutions for the fingertip localization problem, Pattern Recognit., 74, 385, 10.1016/j.patcog.2017.09.001 Ali, 2015, Can–Evo–Ens: Classifier stacking based evolutionary ensemble system for prediction of human breast cancer using amino acid sequences, J. Biomed. Inf., 54, 256, 10.1016/j.jbi.2015.01.004 Wolpert, 1992, Stacked generalization, Neural Netw., 5, 241, 10.1016/S0893-6080(05)80023-1 Nguyen, 2017, Heterogeneous classifier ensemble with fuzzy rule-based meta learner, Inform. Sci. Nguyen, 2016, A novel combining classifier method based on variational inference, Pattern Recognit., 49, 198, 10.1016/j.patcog.2015.06.016 Aburomman, 2017, A survey of intrusion detection systems based on ensemble and hybrid classifiers, Comput. Secur., 65, 135, 10.1016/j.cose.2016.11.004 Ting, 1999, Issues in stacked generalization, J. Artificial Intelligence Res., 10, 271, 10.1613/jair.594 Todorovski, 2003, Combining classifiers with meta decision trees, Mach. Learn., 50, 223, 10.1023/A:1021709817809 Du, 2012, Knowledge extraction algorithm for variances handling of CP using integrated hybrid genetic double multi-group cooperative PSO and DPSO, J. Med. Syst., 36, 979, 10.1007/s10916-010-9562-4 N. Sirikulviriya, S. Sinthupinyo, Integration of rules from a random forest, in: Proceedings of International Conference on Information and Electronics Engineering, 2011, pp. 194–198. Mashayekhi, 2015, Rule extraction from random forest: the RF+HC methods, 223 Mashayekhi, 2017, Rule extraction from decision trees ensembles: new algorithms based on heuristic search and sparse group lasso methods, Int. J. Inf. Technol. Decis. Mak., 16, 1707, 10.1142/S0219622017500055 T.K.P. Lu, V.T.N. Chau, N.H. Phung, Extracting rule RF in educational data classification: from a random forest to interpretable refined rules, in: Proceedings of International Conference on Advanced Computing and Applications, 2015, pp. 20–27. Breiman, 1984, Classification and regression trees (CART), Encycl. Ecol., 40, 582 Rudziński, 2016, A multi-objective genetic optimization of interpretability-oriented fuzzy rule-based classifiers, Appl. Soft Comput., 38, 118, 10.1016/j.asoc.2015.09.038 Gorzałczany, 2016, A multi-objective genetic optimization for fast, fuzzy rule-based credit classification with balanced accuracy and interpretability, Appl. Soft Comput., 40, 206, 10.1016/j.asoc.2015.11.037 Chou, 2014, A multiobjective hybrid genetic algorithm for TFT-LCD module assembly scheduling, IEEE Trans. Autom. Sci. Eng., 11, 692, 10.1109/TASE.2014.2316193 Ding, 2018, A bi-objective load balancing model in a distributed simulation system using NSGA-II and MOPSO approaches, Appl. Soft Comput., 63, 249, 10.1016/j.asoc.2017.09.012 Breiman, 1996, Stacked regressions, Mach. Learn., 24, 49, 10.1007/BF00117832 Li, 2018, Niching genetic network programming with rule accumulation for decision making: an evolutionary rule-based approach, Expert Syst. Appl., 114, 374, 10.1016/j.eswa.2018.07.041 Jin, 2008, Pareto-based multiobjective machine learning: an overview and case studies, IEEE Trans. Syst. Man Cybern. C, 38, 397, 10.1109/TSMCC.2008.919172 Gu, 2015, Multi-objective ensemble generation, WIREs Data Mining Knowl. Discov., 5, 234, 10.1002/widm.1158 A. Madabhushi, J. Shi, M. Feldman, et al. Comparing ensembles of learners: detecting prostate cancer from high resolution MRI, in: International Workshop on Computer Vision Approaches to Medical Image Analysis, 2006, pp. 25–36. Bonab, 2017, Less is more: A comprehensive framework for the number of components of ensemble classifiers, IEEE Trans. Neural Netw. Learn. Syst., 14, 1403