An unsupervised approach to feature discretization and selection
References
Aha, 1991, Instance-based learning algorithms, Machine Learning, 6, 37, 10.1007/BF00153759
V. Bolon-Canedo, S. Seth, N. Sanchez-Marono, A. Alonso-Betanzos, J. Principe, Statistical dependence measure for feature selection in microarray datasets, in: 19th European Symposium on Artificial Neural Networks-ESANN'2011. Belgium, 2011, pp. 23–28.
Boser, 1992, A training algorithm for optimal margin classifiers, 144
Clarke, 2000, Entropy and MDL discretization of continuous variables for Bayesian belief networks, International Journal of Intelligent Systems, 15, 61, 10.1002/(SICI)1098-111X(200001)15:1<61::AID-INT4>3.0.CO;2-O
Cover, 1991
Cristianini, 2000
Demsar, 2006, Statistical comparisons of classifiers over multiple data sets, Journal of Machine Learning Research, 7, 1
Dougherty, 1995, Supervised and unsupervised discretization of continuous features, 194
Duda, 2001
R. Duin, P. Juszczak, P. Paclik, E. Pekalska, D. Ridder, D. Tax, S. Verzakov, PRTools4.1, a Matlab Toolbox for Pattern Recognition, Technical Report, Delft University of Technology, 2007.
Escolano, 2009
Fang, 2011, Integrative gene selection for classification of microarray data, Computer and Information Science, 4, 55
U. Fayyad, K. Irani, Multi-interval discretization of continuous-valued attributes for classification learning, in: Proceedings of the International Joint Conference on Uncertainty in Artificial Intelligence, 1993, pp. 1022–1027.
A. Ferreira, M. Figueiredo, Feature transformation and reduction for text classification, in: 10th International Workshop on Pattern Recognition and Information Systems—PRIS'2010, 2010, pp. 72–81.
A. Ferreira, M. Figueiredo, Unsupervised feature selection for sparse data, in: 19th European Symposium on Artificial Neural Networks-ESANN'2011, 2011, pp. 339–344.
Forman, 2003, An extensive empirical study of feature selection metrics for text classification, Journal of Machine Learning Research, 3, 1289
A. Frank, A. Asuncion, UCI machine learning repository, 2010 〈http://archive.ics.uci.edu/ml〉.
Friedman, 1937, The use of ranks to avoid the assumption of normality implicit in the analysis of variance, Journal of the American Statistical Association, 32, 675, 10.1080/01621459.1937.10503522
Friedman, 1940, A comparison of alternative tests of significance for the problem of m rankings, The Annals of Mathematical Statistics, 11, 86, 10.1214/aoms/1177731944
Furey, 2000, Support vector machine classification and validation of cancer tissue samples using microarray expression data, Bioinformatics, 16, 906, 10.1093/bioinformatics/16.10.906
Guyon, 2003, An introduction to variable and feature selection, Journal of Machine Learning Research, 3, 1157
2006
Hastie, 2001
Ho, 1998, The random subspace method for constructing decision forests, IEEE Transactions on Pattern Analysis and Machine Intelligence, 20, 832, 10.1109/34.709601
Huerta, 2006, A hybrid GA/SVM approach for gene selection and classification of microarray data, 34
Joachims, 2001
Kohavi, 1997, Wrappers for feature subset selection, Artificial Intelligence, 97, 273, 10.1016/S0004-3702(97)00043-X
Lai, 2006, Random subspace method for multivariate feature selection, Pattern Recognition Letters, 27, 1067, 10.1016/j.patrec.2005.12.018
Lee, 2008, An integrated algorithm for gene selection and classification applied to microarray data of ovarian cancer, Artificial Intelligence in Medicine, 42, 81, 10.1016/j.artmed.2007.09.004
Linde, 1980, An algorithm for vector quantizer design, IEEE Transactions on Communications, 28, 84, 10.1109/TCOM.1980.1094577
Liu, 2002, Discretization: an enabling technique, Data Mining and Knowledge Discovery, 6, 393, 10.1023/A:1016304305535
L. Liu, J. Kang, J. Yu, Z. Wang, A comparative study on unsupervised feature selection methods for text clustering, in: IEEE International Conference on Natural Language Processing and Knowledge Engineering, 2005, pp. 597–601.
Manning, 2008
Meyer, 2008, Information-theoretic feature selection in microarray data using variable complementarity, IEEE Journal of Selected Topics in Signal Processing (Special Issue on Genomic and Proteomic Signal Processing), 2, 261, 10.1109/JSTSP.2008.923858
Mitra, 2002, Unsupervised feature selection using feature similarity, IEEE Transactions on Pattern Analysis and Machine Intelligence, 24, 301, 10.1109/34.990133
Peng, 2005, Feature selection based on mutual information: Criteria of max-dependency, max-relevance and min-redundancy, IEEE Transactions on Pattern Analysis and Machine Intelligence, 27, 1226, 10.1109/TPAMI.2005.159
Saeys, 2007, A review of feature selection techniques in bioinformatics, Bioinformatics, 23, 2507, 10.1093/bioinformatics/btm344
Statnikov, 2005, A comprehensive evaluation of multicategory classification methods for microarray gene expression cancer diagnosis, Bioinformatics, 21, 631, 10.1093/bioinformatics/bti033
Tsai, 2008, A discretization algorithm based on class-attribute contingency coefficient, Information Sciences, 178, 714, 10.1016/j.ins.2007.09.004
Vapnik, 1999
Webb, 2005, Not so naive Bayes: aggregating one-dependence estimators, Machine Learning, 58, 5, 10.1007/s10994-005-4258-6
Witten, 2005
Yan, 2009, A formal study of feature selection in text categorization, Journal of Communication and Computer, 6, 32
L. Yu, H. Liu, Feature selection for high-dimensional data: a fast correlation-based filter solution, in: Proceedings of International Conference on Machine Learning—ICML'03, 2003, pp. 856–863.
Yu, 2004, Efficient feature selection via analysis of relevance and redundancy, Journal of Machine Learning Research, 5, 1205
Q. Zhu, L. Lin, M. Shyu, S. Chen, Effective supervised discretization for classification based on correlation maximization, in: IEEE International Conference on Information Reuse and Integration—IRI'2011, 2011, pp. 390–395.
Zien, 2009, The feature importance ranking measure, vol. 5782, 694
