A rule extraction approach from support vector machines for diagnosing hypertension among diabetics

Expert Systems with Applications - Tập 130 - Trang 188-205 - 2019
Namrata Singh1, Pradeep Singh1, Deepika Bhagat2
1Department of Computer Science and Engineering, National Institute of Technology, Raipur 492001, Chhattisgarh, India
2Department of Medicine, Dr. Bhim Rao Ambedkar Memorial Hospital, Pt. JNM, Medical College, Raipur 492001, Chhattisgarh, India

Tài liệu tham khảo

Alberti, 1998, Definition, diagnosis and classification of diabetes mellitus and its complications. Part 1: Diagnosis and classification of diabetes mellitus. Provisional report of a WHO consultation, Diabetic Medicine, 15, 539, 10.1002/(SICI)1096-9136(199807)15:7<539::AID-DIA668>3.0.CO;2-S Andrews, 1995, Survey and critique of techniques for extracting rules from trained artificial neural networks, Knowledge-Based Systems, 8, 373, 10.1016/0950-7051(96)81920-4 Asaduzzaman, 2018, Dataset on significant risk factors for Type 1 diabetes: A Bangladeshi perspective, Data in Brief, 21, 700, 10.1016/j.dib.2018.10.018 Barakat, 2006, Rule extraction from support vector machines: Measuring the explanation capability using the area under the ROC curve, 2, 812 Barakat, 2005, Eclectic rule-extraction from support vector machines, International Journal of Computational Intelligence, 2, 59 Barakat, 2007, Rule extraction from support vector machines: A sequential covering approach, IEEE Transactions on Knowledge and Data Engineering, 19, 729, 10.1109/TKDE.2007.190610 Barakat, 2010, Intelligible support vector machines for diagnosis of diabetes mellitus, IEEE Transactions on Information Technology in Biomedicine, 14, 1114, 10.1109/TITB.2009.2039485 Biostat Diabetes Dataset. (2018). Retrieved January 20, 2019, from http://biostat.mc.vanderbilt.edu/wiki/Main/DataSets Boyd, 2004 Brown, 2009, A new perspective for information theoretic feature selection, 49 Chen, 2018, Diabetes classification model based on boosting algorithms, BMC Bioinformatics, 19, 109, 10.1186/s12859-018-2090-9 Chen, 2016, XGBoost: A scalable tree boosting system, 785 Cho, 2018, IDF diabetes atlas: Global estimates of diabetes prevalence for 2017 and projections for 2045, Diabetes Research and Clinical Practice, 138, 311, 10.1016/j.diabres.2018.02.023 Cohen, 1995, Fast effective rule induction, 115 Cortes, 1995, Support-vector networks, Machine Learning, 20, 273, 10.1007/BF00994018 Cover, 2006 Craven, 1996, Extracting tree-structured representations of trained neural networks, 8, 24 de Fortuny, 2015, Active learning-based pedagogical rule extraction, IEEE Transactions on Neural Networks and Learning Systems, 26, 2664, 10.1109/TNNLS.2015.2389037 Demšar, 2006, Statistical comparisons of classifiers over multiple data sets, Journal of Machine Learning Research, 7, 1 Dheeru, D., & Karra Taniskidou, E. (2017). UCI Machine Learning Repository. Retrieved from http://archive.ics.uci.edu/ml Diederich, 2008, Rule Extraction from Support Vector Machines: An Introduction, 80, 3 Diederich, 2004, Hybrid rule-extraction from support vector machines Ding, 2014, An overview on nonparallel hyperplane support vector machine algorithms, Neural Computing and Applications, 25, 975, 10.1007/s00521-013-1524-6 Dunn, 1961, Multiple comparisons among means, Journal of the American Statistical Association, 56, 52, 10.1080/01621459.1961.10482090 Farquad, 2008, Rule extraction using Support Vector Machine based hybrid classifier, 1 Farquad, 2009, Support Vector Machine based Hybrid Classifiers and Rule Extraction thereof: Application to Bankruptcy Prediction in Banks, 404 Farquad, 2010, Support vector regression based hybrid rule extraction methods for forecasting, Expert Systems with Applications, 37, 5577, 10.1016/j.eswa.2010.02.055 Farran, 2013, Predictive models to assess risk of type 2 diabetes, hypertension and comorbidity: Machine-learning algorithms and validation using national health data from Kuwait—a cohort study, BMJ Open, 3, 10.1136/bmjopen-2012-002457 Feld, 2002, The American association of clinical endocrinologists medical guidelines for the management of diabetes mellitus: The AACE system of intensive diabetes self-management - 2002 update, Endocrine Practice, 8, 40, 10.4158/EP.8.S1.40 Frank, 1998, Generating accurate rule sets without global optimization, 144 Friedman, 2001, Greedy function approximation: A gradient boosting machine, The Annals of Statistics, 29, 1189, 10.1214/aos/1013203451 Friedman, 2002, Stochastic gradient boosting, Computational Statistics and Data Analysis, 38, 367, 10.1016/S0167-9473(01)00065-2 Friedman, 1940, A comparison of alternative tests of significance for the problem of m rankings, The Annals of Mathematical Statistics, 11, 86, 10.1214/aoms/1177731944 Fu, 2004, Extracting the knowledge embedded in support vector machines, 291 Fung, 2005, Rule extraction from linear support vector machines, 32 Geng, 2006, Interestingness measures for data mining, ACM Computing Surveys (CSUR), 38, 10.1145/1132960.1132963 Gorzałczany, 2017, Interpretable and accurate medical data classification –A multi-objective genetic-fuzzy optimization approach, Expert Systems with Applications, 71, 26, 10.1016/j.eswa.2016.11.017 Guo, 2017, Cluster analysis: A new approach for identification of underlying risk factors for coronary artery disease in essential hypertensive patients, Scientific Reports, 7, 43965, 10.1038/srep43965 Guzman, 2017, Design of an optimized fuzzy classifier for the diagnosis of blood pressure with a new computational method for expert rule optimization, Algorithms, 10, 79, 10.3390/a10030079 Guzmán, 2018, Fuzzy optimized classifier for the diagnosis of blood pressure using genetic algorithm, 309 Han, 2015, Rule extraction from support vector machines using ensemble learning approach: An application for diagnosis of diabetes, IEEE Journal of Biomedical and Health Informatics, 19, 728, 10.1109/JBHI.2014.2325615 Hayashi, 2016, Rule extraction using Recursive-Rule extraction algorithm with J48graft combined with sampling selection techniques for the diagnosis of type 2 diabetes mellitus in the Pima Indian dataset, Informatics in Medicine Unlocked, 2, 92, 10.1016/j.imu.2016.02.001 Hsu, 2002, A simple decomposition method for support vector machines, Machine Learning, 46, 291, 10.1023/A:1012427100071 International Diabetes Federation IDF Diabetes Atlas-8th Edition. (2017). Retrieved June 7, 2018, from http://www.diabetesatlas.org/ Jayanthi, 2017, Survey on clinical prediction models for diabetes prediction, Journal of Big Data, 4, 26, 10.1186/s40537-017-0082-7 Jin, 2006, Machine learning techniques and chi-square feature selection for cancer classification using SAGE gene expression profiles, 106 Johansson, 2003, Rule extraction from trained neural networks using genetic programming, 13 Johansson, 2004, The truth is in there-rule extraction from opaque models using genetic programming, 658 Kahramanli, 2009, Extracting rules for classification problems: AIS based approach, Expert Systems with Applications, 36, 10494, 10.1016/j.eswa.2009.01.029 Katayama, 2018, Clinical features and therapeutic perspectives on hypertension in diabetics, Hypertension Research, 41, 213, 10.1038/s41440-017-0001-5 Kavakiotis, 2017, Machine learning and data mining methods in diabetes research, Computational and Structural Biotechnology Journal, 15, 104, 10.1016/j.csbj.2016.12.005 Ketema, 2015, Correlation of fasting and postprandial plasma glucose with HbA1c in assessing glycemic control; systematic review and meta-analysis, Archives of Public Health, 73, 43, 10.1186/s13690-015-0088-6 Kurano, 2018, mRNA expression of platelet activating factor receptor (PAFR) in peripheral blood mononuclear cells is associated with albuminuria and vascular dysfunction in patients with type 2 diabetes, Diabetes Research and Clinical Practice, 136, 124, 10.1016/j.diabres.2017.11.028 LaFreniere, 2016, Using machine learning to predict hypertension from a clinical dataset, 1 Liu, 2013, Prevalence of diabetes mellitus in outpatients with essential hypertension in China: A cross-sectional study, BMJ Open, 3, 10.1136/bmjopen-2013-003798 Lopez-jaramillo, 2014, The goal of blood pressure in the hypertensive patient with diabetes is defined: Now the challenge is go from recommendations to practice, Diabetology & Metabolic Syndrome, 6, 31, 10.1186/1758-5996-6-31 Louppe, G., Wehenkel, L., Sutera, A., & Geurts, P. (2013). Understanding variable importances in forests of randomized trees. Proceedings of the Advances in neural information processing systems. Retrieved from NIPS2013_4928 Luo, 2016, Automatically explaining machine learning prediction results: A demonstration on type 2 diabetes risk prediction, Health Information Science and Systems, 4, 10.1186/s13755-016-0015-4 Malmir, 2017, A medical decision support system for disease diagnosis under uncertainty, Expert Systems with Applications, 88, 95, 10.1016/j.eswa.2017.06.031 Maniruzzaman, 2017, Comparative approaches for classification of diabetes mellitus data: Machine learning paradigm, Computer Methods and Programs in Biomedicine, 152, 23, 10.1016/j.cmpb.2017.09.004 Marling, 2012, Emerging applications for intelligent diabetes management, AI Magazine, 33, 67, 10.1609/aimag.v33i2.2410 Martens, 2009, Decompositional rule extraction from support vector machines by active learning, IEEE Transactions on Knowledge and Data Engineering, 21, 178, 10.1109/TKDE.2008.131 Melin, 2018, A hybrid model based on modular neural networks and fuzzy systems for classification of blood pressure and hypertension risk diagnosis, Expert Systems with Applications, 107, 146, 10.1016/j.eswa.2018.04.023 Menze, 2009, A comparison of random forest and its Gini importance with standard chemometric methods for the feature selection and classification of spectral data, BMC Bioinformatics, 10, 213, 10.1186/1471-2105-10-213 Miramontes, 2017, A Hybrid Intelligent System Model for Hypertension Diagnosis, 541 Miramontes, 2018, A hybrid intelligent system model for hypertension risk diagnosis, 648, 202 Nanditha, 2016, Diabetes in Asia and the Pacific: Implications for the global epidemic, Diabetes Care, 39, 472, 10.2337/dc15-1536 Núñez, 2002, Rule extraction from support vector machines, 107 Núñez, H., Angulo, C., & Català, A. (2004). Rule Based Learning Systems from SVM and RBFNN. Retrieved from http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.144.2476 Ogurtsova, 2017, IDF Diabetes Atlas: Global estimates for the prevalence of diabetes for 2015 and 2040, Diabetes Research and Clinical Practice, 128, 40, 10.1016/j.diabres.2017.03.024 Pourpanah, 2016, A hybrid model of fuzzy ARTMAP and genetic algorithm for data classification and rule extraction, Expert Systems with Applications, 49, 74, 10.1016/j.eswa.2015.11.009 Pulido, 2018, A new model based on a fuzzy system for arterial hypertension classification, 319 Quinlan, 1986, Induction of decision trees, Machine Learning, 1, 81, 10.1007/BF00116251 Rodbard, 2011, Design of a decision support system to help clinicians manage glycemia in patients with type 2 diabetes mellitus, Journal of Diabetes Science and Technology, 5, 402, 10.1177/193229681100500230 Sakr, 2018, Using machine learning on cardiorespiratory fitness data for predicting hypertension: The Henry Ford ExercIse Testing (FIT) Project, PloS One, 13, 10.1371/journal.pone.0195344 Schmitz, 1999, ANN-DT: An algorithm for extraction of decision trees from artificial neural networks, IEEE Transactions on Neural Networks, 10, 1392, 10.1109/72.809084 Seera, 2014, A hybrid intelligent system for medical data classification, Expert Systems with Applications, 41, 2239, 10.1016/j.eswa.2013.09.022 Sim, 2017, Development of a clinical decision support system for diabetes care: A pilot study, PloS One, 12, 10.1371/journal.pone.0173021 Sowers, 2001, Diabetes, hypertension, and cardiovascular disease: An update, Hypertension, 37, 1053, 10.1161/01.HYP.37.4.1053 Steinwart, 2003, Sparseness of support vector machines, Journal of Machine Learning Research, 4, 1071 Steinwart, 2006, An explicit description of the reproducing kernel Hilbert spaces of Gaussian RBF kernels, IEEE Transactions on Information Theory, 52, 4635, 10.1109/TIT.2006.881713 Stoean, 2013, Modeling medical decision making by support vector machines, explaining by rules of evolutionary algorithms with feature selection, Expert Systems with Applications, 40, 2677, 10.1016/j.eswa.2012.11.007 Teramukai, 2016, Dynamic prediction model and risk assessment chart for cardiovascular disease based on on-treatment blood pressure and baseline risk factors, Hypertension Research, 39, 113, 10.1038/hr.2015.120 Tickle, 1998, The truth will come to light: Directions and challenges in extracting the knowledge embedded within trained artificial neural networks, IEEE Transactions on Neural Networks, 9, 1057, 10.1109/72.728352 Tickle, 1994, DEDEC: Decision detection by rule extraction from neural networks, Neurocomputing Research Center QUT NRC van Dieren, 2012, Prediction models for the risk of cardiovascular disease in patients with type 2 diabetes: A systematic review, Heart, 98, 360, 10.1136/heartjnl-2011-300734 Vapnik, 1995 Wang, 2015, Detection of epileptic seizures in EEG signals with rule-based interpretation by random forest approach, 738 Wang, 2014, Super-parameter selection for Gaussian-Kernel SVM based on outlier-resisting, Measurement, 58, 147, 10.1016/j.measurement.2014.08.019 2011 World Health Organization. (2013). A global brief on hypertension: Silent killer, global public health crisis: World Health Day 2013. Retrieved from http://apps.who.int/iris/handle/10665/79059 Zhang, 2004, DRC-BK: Mining classification rules with help of SVM, 191 Zhang, 2005, Rule extraction from trained support vector machines, 61