Multi-generation multi-criteria feature construction using Genetic Programming

Swarm and Evolutionary Computation - Tập 78 - Trang 101285 - 2023
Jianbin Ma1,2, Xiaoying Gao3, Ying Li4
1College of Information Science and Technology, Hebei Agricultural University, Baoding 071001, China
2Hebei Key Laboratory of Agricultural Big Data, Baoding 071001, China
3School of Engineering and Computer Science, Victoria University of Wellington, Wellington 6140, New Zealand
4College of Economics and Management, Hebei Agricultural University, Baoding 071001, China

Tài liệu tham khảo

Han, 2021, Multi-objective particle swarm optimization with adaptive strategies for feature selection, Swarm Evol. Comput., 62, 10.1016/j.swevo.2021.100847 Liu, 2021, An interactive filter-wrapper multi-objective evolutionary algorithm for feature selection, Swarm Evol. Comput., 65, 10.1016/j.swevo.2021.100925 Neshatian, 2010 Koza, 1992 Banzhaf, 1998, Genetic programming: An introduction on the automatic evolution of computer programs and its applications, J. Combin. Theory, 71, 130 Majeed, 2021, Optimizing Genetic Programming by exploiting semantic impact of sub trees, Swarm Evol. Comput., 65, 10.1016/j.swevo.2021.100923 Bakurov, 2022, A novel binary classification approach based on geometric semantic genetic programming, Swarm Evol. Comput., 69, 10.1016/j.swevo.2021.101028 Neshatian, 2012, A filter approach to multiple feature construction for symbolic learning classifiers using genetic programming, IEEE Trans. Evol. Comput., 16, 645, 10.1109/TEVC.2011.2166158 Smith, 2005, Genetic programming with a genetic algorithm for feature construction and selection, Genet. Program. Evol. Mach., 6, 265, 10.1007/s10710-005-2988-7 Tran, 2016, Genetic programming for feature construction and selection in classification on high-dimensional data, Memet. Comput., 8, 3, 10.1007/s12293-015-0173-y Muharram, 2005, Evolutionary constructive induction, IEEE Trans. Knowl. Data Eng., 17, 1518, 10.1109/TKDE.2005.182 Otero, 2003, Genetic programming for attribute construction in data mining, 384 Ahmed, 2014, Multiple feature construction for effective biomarker identification and classification using genetic programming, 249 Subasi, 2010, EEG signal classification using PCA, ICA, LDA and support vector machines, Expert Syst. Appl., 37, 8659, 10.1016/j.eswa.2010.06.065 Du, 2017, Stacked convolutional denoising auto-encoders for feature representation, IEEE Trans. Cybern., 47, 1017, 10.1109/TCYB.2016.2536638 Ma, 2020, A filter-based feature construction and feature selection approach for classification using Genetic Programming, Knowl.-Based Syst., 196, 10.1016/j.knosys.2020.105806 Loughrey, 2004, Overfitting in wrapper-based feature subset selection: The harder you try the worse it gets, 33 Dietterich, 1995, Overfitting and undercomputing in machine learning, ACM Comput. Surv., 27, 326, 10.1145/212094.212114 Srivastava, 2014, Dropout: a simple way to prevent neural networks from overfitting, J. Mach. Learn. Res., 15, 1929 Bramer, 2002, Using J-pruning to reduce overfitting in classification trees, Knowl.-Based Syst., 15, 301, 10.1016/S0950-7051(01)00163-0 Feng, 2014, Fundamentals of neural networks, Technometrics, 37, 470 Prechelt, 1998, Automatic early stopping using cross validation: quantifying the criteria, Neural Netw.: Off. J. Int. Neural Netw. Soc., 11, 761, 10.1016/S0893-6080(98)00010-0 Gonçalves, 2017, Unsure when to stop? Ask your semantic neighbors Nowlan, 1992, Simplifying neural networks by soft weight-sharing, Neural Comput., 4, 473, 10.1162/neco.1992.4.4.473 McNeish, 2015, Using lasso for predictor selection and to assuage overfitting: A method long overlooked in behavioral sciences, Multivar. Behav. Res., 50, 471, 10.1080/00273171.2015.1036965 Hawkins, 2004, The problem of overfitting, J. Chem. Inf. Comput. Sci., 44, 1, 10.1021/ci0342472 Lever, 2016, Model selection and overfitting, Nature Methods, 13, 703, 10.1038/nmeth.3968 Chen, 2019, Structural risk minimisation-driven genetic programming for enhancing generalisation in symbolic regression, IEEE Trans. Evol. Comput., 23, 703, 10.1109/TEVC.2018.2881392 Chan, 2011, Reducing overfitting in manufacturing process modeling using a backward elimination based genetic programming, Appl. Soft Comput., 11, 1648, 10.1016/j.asoc.2010.04.022 Vanneschi, 2010, Measuring bloat, overfitting and functional complexity in genetic programming Feng, 2017, Overfitting reduction of text classification based on AdaBELM, Entropy, 19, 330, 10.3390/e19070330 Rocha, 2017, The Naive Overfitting Index Selection (NOIS): A new method to optimize model complexity for hyperspectral data, ISPRS J. Photogramm. Remote Sens., 133, 61, 10.1016/j.isprsjprs.2017.09.012 Chen, 2017, Feature selection to improve generalisation of genetic programming for high-dimensional symbolic regression, IEEE Trans. Evol. Comput., 21, 792, 10.1109/TEVC.2017.2683489 Liu, 2016, Overfitting in linear feature extraction for classification of high-dimensional image data, Pattern Recognit., 53, 73 Batista, 2020, Improving the detection of burnt areas in remote sensing using hyper-features evolved by M3GP, 1 Guo, 2008, Feature extraction and dimensionality reduction by genetic programming based on the Fisher criterion, Expert Syst., 25, 444, 10.1111/j.1468-0394.2008.00451.x Muharram, 2004, Evolutionary feature construction using information gain and gini index, 379 Guo, 2005, Feature generation using genetic programming with application to fault classification, IEEE Trans. Syst. Man Cybern. B, 35, 89, 10.1109/TSMCB.2004.841426 Guo, 2006, Breast cancer diagnosis using genetic programming generated feature, Pattern Recognit., 39, 980, 10.1016/j.patcog.2005.10.001 Liao, 2021, Genetic programming with random binary decomposition for multi-class classification problems, 564 Firpi, 2006, On prediction of epileptic seizures by means of genetic programming artificial features, Ann. Biomed. Eng., 34, 515, 10.1007/s10439-005-9039-7 Krawiec, 2007, Generative learning of visual concepts using multiobjective genetic programming, Pattern Recognit. Lett., 28, 2385, 10.1016/j.patrec.2007.08.001 Krawiec, 2002, Genetic programming-based construction of features for machine learning and knowledge discovery tasks, Genet. Program. Evol. Mach., 3, 329, 10.1023/A:1020984725014 Muñoz, 2015, M3GP–multiclass classification with GP, 78 Lin, 2005, Evolutionary feature synthesis for object recognition, IEEE Trans. Syst. Man Cybern. C, 35, 156, 10.1109/TSMCC.2004.841912 Bhanu, 2002, Coevolutionary construction of features for transformation of representation in machine learning, 249 Krawiec, 2007, Visual learning by evolutionary and coevolutionary feature synthesis, IEEE Trans. Evol. Comput., 11, 635, 10.1109/TEVC.2006.887351 Ma, 2019, A hybrid multiple feature construction approach using Genetic Programming, Appl. Soft Comput., 80, 687, 10.1016/j.asoc.2019.04.039 Tran, 2016, Multiple feature construction in classification on high-dimensional data using GP, 1 Tran, 2017, Class dependent multiple feature construction using genetic programming for high-dimensional data, 182 Dheeru, 2017 Zhu, 2007, Markov blanket-embedded genetic algorithm for gene selection, Pattern Recognit., 40, 3236, 10.1016/j.patcog.2007.02.007 Xue, 2013, Particle swarm optimization for feature selection in classification: A multi-objective approach, IEEE Trans. Cybern., 43, 1656, 10.1109/TSMCB.2012.2227469 Luke, 2017, ECJ then and now, 1223 Bhowan, 2012, Developing new fitness functions in genetic programming for classification with unbalanced data, IEEE Trans. Syst. Man Cybern. B, 42, 406, 10.1109/TSMCB.2011.2167144 Hollander, 1999 Quinlan, 1993 Shi, 2007 Kohavi, 1996, Scaling up the accuracy of Naive-Bayes classifiers: a decision-tree hybrid Breiman, 2001, Random forests, Mach. Learn., 45, 5, 10.1023/A:1010933404324 Breiman, 1996, Bagging predictors” machine learning, Mach. Learn., 24, 123, 10.1007/BF00058655 Rodriguez, 2006, Rotation forest: A new classifier ensemble method, IEEE Trans. Pattern Anal. Mach. Intell., 28, 1619, 10.1109/TPAMI.2006.211 Katuwal, 2020, Heterogeneous oblique random forest, Pattern Recognit., 99, 10.1016/j.patcog.2019.107078 Hall, 2009, The WEKA data mining software: an update, Acm Sigkdd Explor. Newsl., 11, 10, 10.1145/1656274.1656278 Peng, 2005, Feature selection based on mutual information: Criteria of max-dependency, max-relevance, and min-redundancy, IEEE Trans. Pattern Anal. Mach. Intell., 27, 1226, 10.1109/TPAMI.2005.159 Mohapatra, 2016, Microarray medical data classification using kernel ridge regression and modified cat swarm optimization based gene selection system, Swarm Evol. Comput., 28, 144, 10.1016/j.swevo.2016.02.002 Wang, 2019, Feature selection for classification of microarray gene expression cancers using Bacterial Colony Optimization with multi-dimensional population, Swarm Evol. Comput., 48, 172, 10.1016/j.swevo.2019.04.004 Hall, 1999, Feature selection for machine learning: Comparing a correlation-based filter approach to the wrapper