Optimal feature selection using distance-based discrete firefly algorithm with mutual information criterion

Neural Computing and Applications - Volume 28 - Pages 2795-2808 - 2016
Long Zhang1,2, Linlin Shan3, Jianhua Wang1
1College of Computer Science and Information Engineering, Harbin Normal University, Harbin, China
2School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
3School of Art, Heilongjiang University, Harbin, China

Abstract

In this paper, we investigate the feature subset selection problem using a new self-adaptive firefly algorithm (FA), denoted DbFAFS. Classical FA uses constant control parameters across different problems, which leads to premature convergence and leaves the fireflies trapped in local regions, unable to explore new areas of the search space. To overcome these drawbacks, we introduce two novel parameter selection strategies that dynamically regulate the light absorption coefficient and the randomization control parameter. In addition, the objective function is an important issue in feature subset selection because it strongly affects which features are selected. We propose a criterion based on mutual information that not only measures the correlation between two features selected by a firefly but also determines the emendation of features within the selected feature subset. The proposed approach is compared with differential evolution, a genetic algorithm, and two versions of the particle swarm optimization algorithm on several benchmark datasets. The results demonstrate that the proposed DbFAFS is efficient and competitive in both classification accuracy and computational performance.
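The abstract does not state the exact form of the mutual information criterion, so the following is only a minimal sketch of the quantity it builds on: the standard mutual information I(X;Y) between two discrete features, estimated from empirical counts. The function name `mutual_information` and the toy feature vectors are assumptions made for illustration, not part of DbFAFS.

```python
import math
from collections import Counter


def mutual_information(x, y):
    """Estimate I(X;Y) in bits from two equal-length sequences of discrete values.

    Hypothetical helper for illustration: this is the textbook plug-in
    estimator, not the criterion proposed in the paper.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    if n == 0:
        raise ValueError("x and y must not be empty")

    count_x = Counter(x)           # marginal counts of X
    count_y = Counter(y)           # marginal counts of Y
    count_xy = Counter(zip(x, y))  # joint counts of (X, Y)

    mi = 0.0
    for (a, b), c in count_xy.items():
        # p(a,b) * log2(p(a,b) / (p(a) * p(b))), written in terms of counts
        mi += (c / n) * math.log2(c * n / (count_x[a] * count_y[b]))
    return mi


# Toy data (assumed): feature_b duplicates feature_a, feature_c is independent of it.
feature_a = [0, 0, 1, 1]
feature_b = [0, 0, 1, 1]
feature_c = [0, 1, 0, 1]

print(mutual_information(feature_a, feature_b))  # 1.0
print(mutual_information(feature_a, feature_c))  # 0.0
```

A high value between two candidate features suggests that they carry overlapping information, which is the kind of pairwise relationship a mutual-information-based objective could penalize or reward when scoring a subset.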
