Feature selection model for healthcare analysis and classification using classifier ensemble technique

Senthil Murugan Nagarajan1, V. Muthukumaran2, R. Murugesan2, Rose Bindu Joseph3, Meram Munirathanam4
1School of Computer Science and Engineering, VIT-AP University, Amaravati, India
2Department of Mathematics, School of Applied Sciences, REVA University, Bangalore, India
3Department of Mathematics, Christ Academy Institute for Advance Studies, Bangalore, India
4Department of Mathematics, Rajiv Gandhi University of Knowledge Technologies, Nuzividu, India

Abstract

Diagnosing heart disease is a serious concern, so it needs to be carried out remotely and regularly so that action can be taken early. Detecting heart disease has become a key research area, and many models have been proposed in recent years. Optimization algorithms play a vital role in diagnosing heart disease with high accuracy. The main goal of this work is to develop a hybrid GA-ABC, a genetic-algorithm-based artificial bee colony algorithm, for feature selection and classification using classifier ensemble techniques. The ensemble classifier consists of four algorithms: support vector machine, random forest, Naïve Bayes, and decision tree. The results show that the proposed model, GA-ABC-EL, improves classification accuracy, achieving more than 90%, compared with the other feature selection methods.
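The abstract does not state how the four classifiers are combined or how GA-ABC scores a candidate feature subset. As a minimal sketch, assuming hard majority voting and plain ensemble accuracy as the fitness value, the idea of evaluating a binary feature mask with an ensemble might look like the following. The four threshold rules are toy stand-ins for the SVM, random forest, Naïve Bayes, and decision tree, and every name here (`apply_mask`, `majority_vote`, `fitness`, the toy data) is hypothetical rather than taken from the paper.

```python
from collections import Counter
from typing import Callable, List, Sequence

Row = Sequence[float]
Classifier = Callable[[Row], int]  # assumed interface: selected features -> class label


def apply_mask(row: Row, mask: Sequence[int]) -> List[float]:
    """Keep only the features whose mask bit is 1."""
    return [value for value, keep in zip(row, mask) if keep]


def majority_vote(labels: Sequence[int]) -> int:
    """Return the most frequent label (ties go to the label seen first)."""
    return Counter(labels).most_common(1)[0][0]


def ensemble_predict(classifiers: Sequence[Classifier], row: Row,
                     mask: Sequence[int]) -> int:
    """Hard-voting prediction on the masked feature vector (an assumption)."""
    selected = apply_mask(row, mask)
    return majority_vote([clf(selected) for clf in classifiers])


def fitness(classifiers: Sequence[Classifier], rows: Sequence[Row],
            labels: Sequence[int], mask: Sequence[int]) -> float:
    """Ensemble accuracy for one mask: the score a wrapper search would maximise."""
    hits = sum(ensemble_predict(classifiers, row, mask) == label
               for row, label in zip(rows, labels))
    return hits / len(rows)


# Toy stand-ins for the four base learners (not real SVM/RF/NB/DT models).
CLASSIFIERS: List[Classifier] = [
    lambda x: int(sum(x) / len(x) > 0.5),
    lambda x: int(max(x) > 0.75),
    lambda x: int(min(x) > 0.4),
    lambda x: int(x[0] > 0.5),
]

# Toy data: features 0 and 2 carry the signal, feature 1 is noise.
ROWS = [[0.9, 0.1, 0.8], [0.2, 0.7, 0.1], [0.8, 0.9, 0.7], [0.1, 0.2, 0.3]]
LABELS = [1, 0, 1, 0]

if __name__ == "__main__":
    for mask in ([1, 0, 1], [0, 1, 0]):
        print(mask, fitness(CLASSIFIERS, ROWS, LABELS, mask))
```

In a wrapper method of this kind, a search procedure such as a genetic or bee colony algorithm would propose masks and keep those with higher fitness; the search itself is left out here because the abstract gives no details about it.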
