Taxonomy for characterizing ensemble methods in classification tasks: A review and annotated bibliography

Computational Statistics and Data Analysis, Volume 53, Pages 4046–4072, 2009
Lior Rokach1
1Department of Information System Engineering, Ben-Gurion University of the Negev, Israel

References

Abeel, 2009, Java-ML: A machine learning library, Journal of Machine Learning Research, 10, 931
Adem, 2004, Aggregating classifiers with mathematical programming, Computational Statistics and Data Analysis, 47, 791, 10.1016/j.csda.2003.11.015
Ahn, 2007, Classification by ensembles from random partitions of high-dimensional data, Computational Statistics and Data Analysis, 51, 6166, 10.1016/j.csda.2006.12.043
Ali, 1996, Error reduction through learning multiple descriptions, Machine Learning, 24, 173, 10.1007/BF00058611
Altincay, 2007, Decision trees using model ensemble-based nodes, Pattern Recognition, 40, 3540, 10.1016/j.patcog.2007.03.023
Anand, 1995, Efficient classification for multiclass problems using modular neural networks, IEEE Transactions on Neural Networks, 6, 117, 10.1109/72.363444
Arbel, 2006, Classifier evaluation under limited resources, Pattern Recognition Letters, 27, 1619, 10.1016/j.patrec.2006.03.008
Archer, 2008, Empirical characterization of random forest variable importance measures, Computational Statistics and Data Analysis, 52, 2249, 10.1016/j.csda.2007.08.015
Averbuch, 2004, Context-sensitive medical information retrieval, 282
Banfield, 2007, A comparison of decision tree ensemble creation techniques, IEEE Transactions on Pattern Analysis and Machine Intelligence, 29, 173, 10.1109/TPAMI.2007.250609
Banfield, R., 2005. OpenDT. http://opendt.sourceforge.net/
Bao, 2002, Combining multiple K-nearest neighbor classifiers for text classification by reducts, vol. 2534, 340
Bartlett, 1998, Generalization performance of support vector machines and other pattern classifiers
Bauer, 1999, An empirical comparison of voting classification algorithms: Bagging, boosting, and variants, Machine Learning, 35, 1
Bay, 1999, Nearest neighbor classification from multiple feature subsets, Intelligent Data Analysis, 3, 191, 10.3233/IDA-1999-3304
Bennett, 2002, Exploiting unlabeled data in ensemble methods, 289
Brazdil, 1994, Characterizing the applicability of classification algorithms using meta level learning, vol. 784, 83
Breiman, 1996, Bagging predictors, Machine Learning, 24, 123, 10.1007/BF00058655
Breiman, 1998, Arcing classifiers, Annals of Statistics, 26, 801
Breiman, 1999, Pasting small votes for classification in large databases and on-line, Machine Learning, 36, 85
Breiman, 2001, Random forests, Machine Learning, 45, 5, 10.1023/A:1010933404324
Brodley, 1995, Recursive automatic bias selection for classifier construction, Machine Learning, 20, 63, 10.1007/BF00993475
Brown, 2003, Negative correlation learning and the ambiguity family of ensemble methods, Multiple Classifier Systems, 266, 10.1007/3-540-44938-8_27
Brown, 2005, Diversity creation methods: A survey and categorisation, Information Fusion, 6, 5, 10.1016/j.inffus.2004.04.004
Bruzzone, 2004, Detection of land-cover transitions by combining multidate classifiers, Pattern Recognition Letters, 25, 1491, 10.1016/j.patrec.2004.06.002
Bryll, 2003, Attribute bagging: Improving accuracy of classifier ensembles by using random feature subsets, Pattern Recognition, 36, 1291, 10.1016/S0031-3203(02)00121-8
Buntine, 1996, Graphical models for discovering knowledge, 59
Buttrey, 2002, Using k-nearest-neighbor classification in the leaves of a tree, Computational Statistics and Data Analysis, 40, 27, 10.1016/S0167-9473(01)00098-6
Caruana, R., Niculescu-Mizil, A., Crew, G., Ksikes, A., 2004. Ensemble selection from libraries of models. In: Twenty-first International Conference on Machine Learning, July 04–08, Banff, Alberta, Canada
Chan, P.K., Stolfo, S.J., 1995. A comparative evaluation of voting and meta-learning on partitioned data. In: Proc. 12th Intl. Conf. on Machine Learning, ICML-95
Chan, 1997, On the accuracy of meta-learning for scalable data mining, Journal of Intelligent Information Systems, 8, 5, 10.1023/A:1008640732416
Chawla, 2004, Learning ensembles from bites: A scalable and accurate approach, Journal of Machine Learning Research, 5, 421
Christensen, 2004, Designing committees of models through deliberate weighting of data points, The Journal of Machine Learning Research, 4, 39
Christmann, 2007, Robust learning from bites for data mining, Computational Statistics and Data Analysis, 52, 347, 10.1016/j.csda.2006.12.009
Clark, 1991, Rule induction with CN2: Some recent improvements, 151
Cohen, 2007, Decision tree input space decomposition with grouped gain-ratio, Information Sciences, 177, 3592, 10.1016/j.ins.2007.01.016
Croux, 2007, Trimmed bagging, Computational Statistics and Data Analysis, 52, 362, 10.1016/j.csda.2007.06.012
Cunningham, 2000, Diversity versus quality in classification ensembles based on feature selection, vol. 1810, 109
Dasarathy, 1979, Composite classifier system design: Concepts and methodology, Proceedings of the IEEE, 67, 708, 10.1109/PROC.1979.11321
Demsar, J., Zupan, B., Leban, G., 2004. Orange: From experimental machine learning to interactive data mining. White Paper (www.ailab.si/orange), Faculty of Computer and Information Science, University of Ljubljana
Denison, 2002, Bayesian partition modelling, Computational Statistics and Data Analysis, 38, 475, 10.1016/S0167-9473(01)00073-1
Derbeko, 2002, Variance optimized bagging
Džeroski, 2004, Is combining classifiers with stacking better than selecting the best one?, Machine Learning, 54, 255, 10.1023/B:MACH.0000015881.36452.6e
Dietterich, 1995, Solving multiclass learning problems via error-correcting output codes, Journal of Artificial Intelligence Research, 2, 263, 10.1613/jair.105
Dietterich, 2000, An experimental comparison of three methods for constructing ensembles of decision trees: Bagging, boosting and randomization, 40, 139
Dietterich, 2000, Ensemble methods in machine learning, 1
Dimitrakakis, 2005, Online adaptive policies for ensemble classifiers, Neurocomputing, 64, 211, 10.1016/j.neucom.2004.11.031
Domingos, 1996, Using partitioning to speed up specific-to-general rule induction, 29
Drucker, 2002, Effect of pruning and early stopping on performance of a boosting ensemble, Computational Statistics and Data Analysis, 38, 393, 10.1016/S0167-9473(01)00067-6
Duin, R.P.W., 2002. The combining classifier: To train or not to train? In: Proc. 16th International Conference on Pattern Recognition, ICPR'02, Canada, pp. 765–770
Elovici, 2002, Using the information structure model to compare profile-based information filtering systems, Information Retrieval Journal, 6, 75, 10.1023/A:1022952531694
Elovici, 2006, A decision theoretic approach to combining information filters: Analytical and empirical evaluation, Journal of the American Society for Information Science and Technology (JASIST), 57, 306, 10.1002/asi.20278
Frank, 2005, WEKA — A machine learning workbench for data mining, 1305
Freund, Y., Mason, L., 1999. The alternating decision tree algorithm. In: Proceedings of the 16th International Conference on Machine Learning, pp. 124–133
Freund, Y., Schapire, R.E., 1996. Experiments with a new boosting algorithm. In: Machine Learning: Proceedings of the Thirteenth International Conference, pp. 325–332
Friedman, 1991, Multivariate adaptive regression splines, The Annals of Statistics, 19, 1, 10.1214/aos/1176347963
Friedman, 2000, Additive logistic regression: A statistical view of boosting, Annals of Statistics, 28, 337, 10.1214/aos/1016218223
Friedman, 2002, Stochastic gradient boosting, Computational Statistics and Data Analysis, 38, 367, 10.1016/S0167-9473(01)00065-2
Gama, 2000, A Linear-Bayes classifier, vol. 1952, 269
Gams, 1989, New measurements highlight the importance of redundant knowledge
Gey, 2006, Boosting and instability for regression trees, Computational Statistics and Data Analysis, 50, 533, 10.1016/j.csda.2004.09.001
Gunter, 2004, Feature selection algorithms for the generation of multiple classifier systems, Pattern Recognition Letters, 25, 1323, 10.1016/j.patrec.2004.05.002
Hansen, J., 2000. Combining predictors: Meta machine learning methods and bias, variance & ambiguity decompositions. Ph.D. dissertation, Aarhus University
Hansen, 1990, Neural network ensembles, IEEE Transactions on Pattern Analysis and Machine Intelligence, 12, 993, 10.1109/34.58871
Ho, 1998, The random subspace method for constructing decision forests, IEEE Transactions on Pattern Analysis and Machine Intelligence, 20, 832, 10.1109/34.709601
Ho, 2001, Data complexity analysis for classifier combination, vol. 2096, 53
Ho, 2002, Multiple classifier combination: Lessons and next steps, 171
Ho, T.K., 1998. Nearest neighbors in random subspaces. In: Proc. of the Second International Workshop on Statistical Techniques in Pattern Recognition, Sydney, Australia, August 11–13, pp. 640–648
Ho, 1994, Decision combination in multiple classifier systems, IEEE Transactions on Pattern Analysis and Machine Intelligence, 16, 66, 10.1109/34.273716
Hothorn, 2005, Bundling classifiers by bagging trees, Computational Statistics and Data Analysis, 49, 1068, 10.1016/j.csda.2004.06.019
Hu, X., 2001. Using rough sets theory and database operations to construct a good ensemble of classifiers for data mining applications. In: ICDM01, pp. 233–240
Hu, 2005, Constructing rough decision forests, vol. 3642, 147
Hu, 2007, EROS: Ensemble rough subspaces, Pattern Recognition, 40, 3728, 10.1016/j.patcog.2007.04.022
Huang, 1995, A method of combining multiple experts for the recognition of unconstrained handwritten numerals, IEEE Transactions on Pattern Analysis and Machine Intelligence, 17, 90, 10.1109/34.368145
Mierswa, I., Wurst, M., Klinkenberg, R., Scholz, M., Euler, T., 2006. YALE: Rapid prototyping for complex data mining tasks. In: Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-06)
Islam, 2003, A constructive algorithm for training cooperative neural network ensembles, IEEE Transactions on Neural Networks, 14, 820, 10.1109/TNN.2003.813832
Jacobs, 1991, Adaptive mixtures of local experts, Neural Computation, 3, 79, 10.1162/neco.1991.3.1.79
Jordan, 1994, Hierarchical mixtures of experts and the EM algorithm, Neural Computation, 6, 181, 10.1162/neco.1994.6.2.181
Kamel, 2003, Data dependence in combining classifiers, vol. 2709, 1
Kang, 2005, Combination of multiple classifiers by minimizing the upper bound of Bayes error rate for unconstrained handwritten numeral recognition, International Journal of Pattern Recognition and Artificial Intelligence, 19, 395, 10.1142/S0218001405004101
Kim, 2005, Inverse boosting for monotone regression functions, Computational Statistics and Data Analysis, 49, 757, 10.1016/j.csda.2004.05.038
Kohavi, R., 1996. Scaling up the accuracy of naive-Bayes classifiers: A decision-tree hybrid. In: Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, pp. 114–119
Kolen, 1991, Back propagation is sensitive to initial conditions, vol. 3, 860
Krogh, 1995, Neural network ensembles, cross validation and active learning, vol. 7, 231
Kuncheva, 2005
Kuncheva, 2005, Diversity in multiple classifier systems (Editorial), Information Fusion, 6, 3, 10.1016/j.inffus.2004.04.009
Kuncheva, 2003, Measures of diversity in classifier ensembles and their relationship with ensemble accuracy, Machine Learning, 51, 181, 10.1023/A:1022859003006
Kusiak, 2000, Decomposition in data mining: An industrial case study, IEEE Transactions on Electronics Packaging Manufacturing, 23, 345, 10.1109/6104.895081
Lam, 2000, Classifier combinations: Implementations and theoretical issues, 78
Langdon, 2002, Combining decision trees and neural networks for drug discovery, 60
Leigh, 2002, Forecasting the NYSE composite index with technical analysis, pattern recognizer, neural networks, and genetic algorithm: A case study in romantic decision support, Decision Support Systems, 32, 361, 10.1016/S0167-9236(01)00121-X
Li, 2006, Multitraining support vector machine for image retrieval, IEEE Transactions on Image Processing, 15, 3597, 10.1109/TIP.2006.881938
Liao, 2000, Constructing heterogeneous committees via input feature grouping, vol. 12
Lin, 2006, Content-based image retrieval trained by AdaBoost for mobile application, International Journal of Pattern Recognition and Artificial Intelligence, 20, 525, 10.1142/S021800140600482X
Lin, 2005, Combining multiple classifiers based on a statistical method for handwritten Chinese character recognition, International Journal of Pattern Recognition and Artificial Intelligence, 19, 1027, 10.1142/S0218001405004459
Liu, 2004, An empirical study of building compact ensembles, 622
Liu, 2005, Classifier combination based on confidence transformation, Pattern Recognition, 38, 11, 10.1016/j.patcog.2004.05.013
Lu, 1999, Task decomposition and module combination based on class relations: A modular neural network for pattern classification, IEEE Transactions on Neural Networks, 10, 1244, 10.1109/72.788664
Maimon, 2002, Improving supervised learning by feature decomposition, 178
Maimon, 2005, vol. 61
Maimon, 2001, Data mining by attribute decomposition with semiconductors manufacturing case study, 311
Mangiameli, 2004, Model selection for medical diagnosis decision support systems, Decision Support Systems, 36, 247, 10.1016/S0167-9236(02)00143-4
Margineantu, D., Dietterich, T., 1997. Pruning adaptive boosting. In: Proc. Fourteenth Intl. Conf. on Machine Learning, pp. 211–218
Melville, P., Mooney, R.J., 2003. Constructing diverse classifier ensembles using artificial training examples. In: IJCAI 2003, pp. 505–512
Menahem, 2009, Improving malware detection by applying multi-inducer ensemble, Computational Statistics and Data Analysis, 53, 1483, 10.1016/j.csda.2008.10.015
Menahem, E., Rokach, L., Elovici, Y., Troika — An improved stacking schema for classification tasks, Information Sciences (in press)
Merkwirth, 2004, Ensemble methods for classification in cheminformatics, Journal of Chemical Information and Modeling, 44, 1971
Merler, 2007, Parallelizing AdaBoost by weights dynamics, Computational Statistics and Data Analysis, 51, 2487, 10.1016/j.csda.2006.09.001
Merz, 1999, Using correspondence analysis to combine classifiers, Machine Learning, 36, 33, 10.1023/A:1007559205422
Michalski, 1994
Mitchell, 1997
Moskovitch, 2008, Detection of unknown computer worms based on behavioral classification of the host, Computational Statistics and Data Analysis, 52, 4544, 10.1016/j.csda.2008.01.028
Nowlan, 1991, Evaluation of adaptive mixtures of competing experts, vol. 3, 774
Opitz, 1999, Feature selection for ensembles, 379
Opitz, 1999, Popular ensemble methods: An empirical study, Journal of Artificial Intelligence Research, 11, 169
Opitz, 1996, Generating accurate and diverse members of a neural-network ensemble, vol. 8, 535
Parmanto, 1996, Improving committee diagnosis with resampling techniques, vol. 8, 882
Partridge, 1996, Engineering multiversion neural-net systems, Neural Computation, 8, 869, 10.1162/neco.1996.8.4.869
Pham, 2008, Quadratic boosting, Pattern Recognition, 41, 331, 10.1016/j.patcog.2007.05.008
Polikar, 2006, Ensemble based systems in decision making, IEEE Circuits and Systems Magazine, 6, 21, 10.1109/MCAS.2006.1688199
Prodromidis, A.L., Stolfo, S.J., Chan, P.K., 1999. Effective and efficient pruning of metaclassifiers in a distributed data mining system. Technical report CUCS-017-99, Columbia University
Provost, F.J., Kolluri, V., 1997. A survey of methods for scaling up inductive learning algorithms. In: Proc. 3rd International Conference on Knowledge Discovery and Data Mining
Quinlan, J.R., 1996. Bagging, boosting, and C4.5. In: Proceedings of the Thirteenth National Conference on Artificial Intelligence, pp. 725–730
Rakotomalala, R., 2005. TANAGRA: A free software for research and academic purposes. In: Proceedings of EGC'2005, RNTI-E-3, vol. 2, pp. 697–702
R Development Core Team, 2004. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-00-3. http://cran.r-project.org/
Ridgeway, 2002, Looking for lumps: Boosting and bagging for density estimation, Computational Statistics and Data Analysis, 38, 379, 10.1016/S0167-9473(01)00066-4
Rokach, 2008, Genetic algorithm-based feature set partitioning for classification problems, Pattern Recognition, 41, 1676, 10.1016/j.patcog.2007.10.013
Rokach, 2008, Mining manufacturing data using genetic algorithm-based feature set decomposition, International Journal of Intelligent Systems Technologies and Applications, 4, 57, 10.1504/IJISTA.2008.016359
Rokach, 2006, Data mining for improving the quality of manufacturing: A feature set decomposition approach, Journal of Intelligent Manufacturing, 17, 285, 10.1007/s10845-005-0005-x
Rokach, 2006, Decomposition methodology for classification tasks — a meta decomposer framework, Pattern Analysis and Applications, 9, 257, 10.1007/s10044-006-0041-y
Rokach, 2005, Clustering methods, 321
Rokach, 2009, Collective-agreement-based pruning of ensembles, Computational Statistics and Data Analysis, 53, 1015, 10.1016/j.csda.2008.12.001
Rokach, 2001, Theory and applications of attribute decomposition, 473
Rokach, 2006, Selective voting — getting more for less in sensor fusion, International Journal of Pattern Recognition and Artificial Intelligence, 20, 329, 10.1142/S0218001406004739
Rokach, 2007, A methodology for improving the performance of non-ranker feature selection filters, International Journal of Pattern Recognition and Artificial Intelligence, 21, 809, 10.1142/S0218001407005727
Rokach, 2005, Feature set decomposition for decision trees, Journal of Intelligent Data Analysis, 9, 131, 10.3233/IDA-2005-9202
Rokach, 2005, Improving supervised learning by sample decomposition, International Journal of Computational Intelligence and Applications, 5, 37, 10.1142/S146902680500143X
Rokach, 2003, Space decomposition in data mining: A clustering approach, 24
Rokach, 2004, Information retrieval system for medical narrative reports, vol. 3055, 217
Rokach, 2008, Negation recognition in medical narrative reports, Information Retrieval, 11, 499, 10.1007/s10791-008-9061-0
Rosen, 1996, Ensemble learning using decorrelated neural networks, Connection Science, 8, 373, 10.1080/095400996116820
Rudin, 2004, The dynamics of AdaBoost: Cyclic behavior and convergence of margins, Journal of Machine Learning Research, 5, 1557
Schaffer, 1993, Selecting a classification method by cross-validation, Machine Learning, 13, 135, 10.1007/BF00993106
Schapire, 1990, The strength of weak learnability, Machine Learning, 5, 197, 10.1007/BF00116037
Schclar, 2009, Random projection ensemble classifiers, 309
Schclar, A., Rokach, L., Meisels, A., 2009. Ensemble methods for improving the performance of neighborhood-based collaborative filtering. In: Proc. ACM RecSys (in press)
Seewald, A.K., Fürnkranz, J., 2001. Grading classifiers. Austrian Research Institute for Artificial Intelligence
Seewald, 2002, How to make stacking better and faster while also taking care of an unknown weakness, 554
Sexton, 2008, LogitBoost with errors-in-variables, Computational Statistics and Data Analysis, 52, 2549, 10.1016/j.csda.2007.09.004
Sharkey, 1996, On combining artificial neural nets, Connection Science, 8, 299, 10.1080/095400996116785
Sharkey, 1997, Combining diverse neural networks, The Knowledge Engineering Review, 12, 231, 10.1017/S0269888997003123
Sharkey, 1999, Multi-net systems, 1
Sharkey, 2002, Types of multinet system, vol. 2364, 108
Shilen, 1990, Multiple binary tree classifiers, Pattern Recognition, 23, 757, 10.1016/0031-3203(90)90098-6
Skurichina, 2002, Bagging, boosting and the random subspace method for linear classifiers, Pattern Analysis and Applications, 5, 121, 10.1007/s100440200011
Sohn, S.Y., Choi, H., 2001. Ensemble based on Data Envelopment Analysis. In: ECML Meta Learning Workshop, September 4
Sohn, 2007, Experimental study for the comparison of classifier combination methods, Pattern Recognition, 40, 33, 10.1016/j.patcog.2006.06.027
Sivalingam, 2005, Minimal classification method with error-correcting codes for multiclass recognition, International Journal of Pattern Recognition and Artificial Intelligence, 19, 663, 10.1142/S0218001405004241
Sun, 2006, Reducing the overfitting of AdaBoost by controlling its data distribution skewness, International Journal of Pattern Recognition and Artificial Intelligence, 20, 1093, 10.1142/S0218001406005137
Tan, 2003, Multi-class protein fold classification using a new ensemble machine learning approach, Genome Informatics, 14, 206
Tao, 2006, Asymmetric bagging and random subspace for support vector machines-based relevance feedback in image retrieval, IEEE Transactions on Pattern Analysis and Machine Intelligence, 28, 1088, 10.1109/TPAMI.2006.134
Tao, 2007, Negative samples analysis in relevance feedback, IEEE Transactions on Knowledge and Data Engineering, 19, 568, 10.1109/TKDE.2007.1003
Tao, D., Tang, X., 2004. SVM-based relevance feedback using random subspace method. In: IEEE International Conference on Multimedia and Expo, pp. 647–652
Towell, 1994, Knowledge-based artificial neural networks, Artificial Intelligence, 70, 119, 10.1016/0004-3702(94)90105-8
Tsao, 2007, A stochastic approximation view of boosting, Computational Statistics and Data Analysis, 52, 325, 10.1016/j.csda.2007.06.020
Tsymbal, 2002, Ensemble feature selection with the simple Bayesian classification in medical diagnostics, 225
Tsymbal, 2005, Diversity in search strategies for ensemble feature selection, Information Fusion, 6, 83, 10.1016/j.inffus.2004.04.003
Tukey, 1977
Tumer, 1996, Error correlation and error reduction in ensemble classifiers, Connection Science, 8, 385, 10.1080/095400996116839
Tumer, 2000, Robust order statistics based ensembles for distributed data mining, 185
Tumer, 2003, Input decimated ensembles, Pattern Analysis and Applications, 6, 65, 10.1007/s10044-002-0181-7
Tutz, G., Binder, H., 2006. Boosting ridge regression. Computational Statistics and Data Analysis (in press), 10.1016/j.csda.2006.11.041
Valentini, 2002, Ensembles of learning machines, vol. 2486, 3
Vilalta, 2005, Meta-learning, 731
Wanas, 2006, Adaptive fusion and co-operative training for classifier ensembles, Pattern Recognition, 39, 1781, 10.1016/j.patcog.2006.02.003
Wang, 2000, Diversity between neural networks and decision trees for building multiple classifier systems, vol. 1857, 240
Webb, 2000, MultiBoosting: A technique for combining boosting and wagging, Machine Learning, 40, 159, 10.1023/A:1007659514849
Webb, 2004, Multistrategy ensemble learning: Reducing error by combining ensemble learning techniques, IEEE Transactions on Knowledge and Data Engineering, 16, 10.1109/TKDE.2004.29
Windeatt, 2001, An empirical comparison of pruning methods for ensemble classifiers, vol. 2189, 208
Woods, 1997, Combination of multiple classifiers using local accuracy estimates, IEEE Transactions on Pattern Analysis and Machine Intelligence, 19, 405, 10.1109/34.588027
Wolpert, 1992, Stacked generalization, Neural Networks, 5, 241, 10.1016/S0893-6080(05)80023-1
Wu, 2005, Multi-knowledge for decision making, Knowledge and Information Systems, 7, 246, 10.1007/s10115-004-0150-0
Xu, 1992, Methods of combining multiple classifiers and their application to handwriting recognition, IEEE Transactions on Systems, Man, and Cybernetics, 22, 418
Yates, 1996, Use of methodological diversity to improve neural network generalization, Neural Computing and Applications, 4, 114, 10.1007/BF01413747
Zenobi, G., Cunningham, P., 2001. Using diversity in preparing ensembles of classifiers based on different feature subsets to minimize generalization error. In: Proceedings of the European Conference on Machine Learning
Zhang, 2008, A local boosting algorithm for solving classification problems, Computational Statistics and Data Analysis, 52, 1928, 10.1016/j.csda.2007.06.015
Zhang, 2008, Using boosting to prune double-bagging ensembles, Computational Statistics and Data Analysis
Zhou, 2003, Selective ensemble of decision trees, vol. 2639, 476
Zhou, 2002, Ensembling neural networks: Many could be better than all, Artificial Intelligence, 137, 239, 10.1016/S0004-3702(02)00190-X
Zhou, 2004, NeC4.5: Neural ensemble based C4.5, IEEE Transactions on Knowledge and Data Engineering, 16, 770, 10.1109/TKDE.2004.11
Zhou, 2008, Data-driven decomposition for multi-class classification, Pattern Recognition, 41, 67, 10.1016/j.patcog.2007.05.020
Zupan, 1998, Feature transformation by function decomposition, IEEE Intelligent Systems & Their Applications, 13, 38, 10.1109/5254.671090