Ensemble gene selection by grouping for microarray data classification

Journal of Biomedical Informatics - Volume 43 - Pages 81-87 - 2010
Huawen Liu1, Lei Liu1, Huijie Zhang2
1College of Computer Science, Jilin University, Changchun 130012, China
2College of Computer, Northeast Normal University, Changchun 130021, China

References

Golub, 1999, Molecular classification of cancer: class discovery and class prediction by gene expression monitoring, Science, 286, 531, 10.1126/science.286.5439.531
Larrañaga, 2006, Machine learning in bioinformatics, Brief Bioinform, 7, 86, 10.1093/bib/bbk007
Dupuy, 2007, Critical review of published microarray studies for cancer outcome and guidelines on statistical analysis and reporting, J Natl Cancer Inst, 9, 147, 10.1093/jnci/djk018
Boulesteix, 2008, Evaluating microarray-based classifiers: an overview, Cancer Inform, 6, 77, 10.4137/CIN.S408
Natsoulis, 2005, Classification of a large microarray data set: algorithm comparison and analysis of drug signatures, Genome Res, 15, 724, 10.1101/gr.2807605
Somorjai, 2003, Class prediction and discovery using gene microarray and proteomics mass spectrometry data: curses, caveats, cautions, Bioinformatics, 19, 1484, 10.1093/bioinformatics/btg182
Saeys, 2007, A review of feature selection techniques in bioinformatics, Bioinformatics, 23, 2507, 10.1093/bioinformatics/btm344
Hilario, 2008, Approaches to dimensionality reduction in proteomic biomarker studies, Brief Bioinform, 9, 102, 10.1093/bib/bbn005
Nam, 2008, Gene-set approach for expression pattern analysis, Brief Bioinform, 9, 189, 10.1093/bib/bbn001
Ding, 2005, Minimum redundancy feature selection from microarray gene expression data, J Bioinform Comput Biol, 3, 185, 10.1142/S0219720005001004
Shen, 2009, New gene selection method for multiclass tumor classification by class centroid, J Biomed Inform, 42, 59, 10.1016/j.jbi.2008.05.011
Zhou, 2007, MSVM-RFE: extensions of SVM-RFE for multiclass gene selection on DNA microarray data, Bioinformatics, 23, 1106, 10.1093/bioinformatics/btm036
Yeh, 2008, Applying data mining techniques for cancer classification on gene expression data, Cybern Syst, 39, 583, 10.1080/01969720802188292
Zhu, 2007, Markov blanket-embedded genetic algorithm for gene selection, Pattern Recognit, 40, 3236, 10.1016/j.patcog.2007.02.007
Au, 2005, Attribute clustering for grouping, selection, and classification of gene expression data, IEEE/ACM Trans Comput Biol Bioinform, 2, 83, 10.1109/TCBB.2005.17
Yu L, Ding C, Loscalzo S. Stable feature selection via dense feature groups. In: Proceeding of the 14th ACM SIGKDD international conference on knowledge discovery and data mining. Las Vegas, USA: ACM; 2008. p. 803–11.
Boulesteix, 2008, Microarray-based classification and clinical predictors: on combined classifiers and additional predictive value, Bioinformatics, 24, 1698, 10.1093/bioinformatics/btn262
Díaz-Uriarte, 2006, Gene selection and classification of microarray data using random forest, BMC Bioinform, 7, 3, 10.1186/1471-2105-7-3
Moon, 2007, Ensemble methods for classification of patients for personalized medicine with high-dimensional data, Artif Intell Med, 41, 197, 10.1016/j.artmed.2007.07.003
Cho, 2003, Data mining for gene expression profiles from DNA microarray, Int J Software Eng Knowledge Eng, 13, 593, 10.1142/S0218194003001469
Cho, 2007, Cancer classification using ensemble of neural networks with multiple significant gene subsets, Appl Intell, 26, 243, 10.1007/s10489-006-0020-4
Saeys, 2007, Robust feature selection using ensemble feature selection techniques, 313
Wang, 2008, A general wrapper approach to selection of class-dependent features, IEEE Trans Neural Netw, 19, 1267, 10.1109/TNN.2008.2000395
Okun, 2008, Dataset complexity in gene expression based cancer classification using ensembles of k-nearest neighbors, Artif Intell Med
Yan, 2008, Selecting informative genes for discriminant analysis using multigene expression profiles, BMC Genomics, 9, S14, 10.1186/1471-2164-9-S2-S14
Ein-Dor, 2005, Outcome signature genes in breast cancer: is there a unique set?, Bioinformatics, 21, 171, 10.1093/bioinformatics/bth469
Zeng, 2008, Dimension reduction with redundant genes elimination for tumor classification, BMC Bioinform, 9, S8, 10.1186/1471-2105-9-S6-S8
Alexe, 2006, Pattern-based feature selection in genomics and proteomics, Ann Oper Res, 148, 189, 10.1007/s10479-006-0084-x
1991
Liu, 2009, Feature selection with dynamic mutual information, Pattern Recognit, 42, 1330, 10.1016/j.patcog.2008.10.028
Forman, 2003, An extensive empirical study of feature selection metrics for text classification, J Mach Learn Res, 3, 1289
Hua, 2009, Performance of feature-selection methods in the classification of high-dimension data, Pattern Recognit, 42, 409, 10.1016/j.patcog.2008.08.001
Li, 2004, A comparative study of feature selection and multiclass classification methods for tissue classification based on gene expression, Bioinformatics, 20, 2429, 10.1093/bioinformatics/bth267
Kerr, 2008, Techniques for clustering gene expression data, Comput Biol Med, 38, 283, 10.1016/j.compbiomed.2007.11.001
Yu, 2004, Efficient feature selection via analysis of relevance and redundancy, J Mach Learn Res, 5, 1205
Dietterich T. Ensemble methods in machine learning. In: Proceedings of the 1st international workshop on multiple classifier systems; 2000. p. 1–15.
Tsymbal, 2005, Diversity in search strategies for ensemble feature selection, Inf Fusion, 6, 83, 10.1016/j.inffus.2004.04.003
van’t Veer, 2002, Gene expression profiling predicts clinical outcome of breast cancer, Nature, 415, 530, 10.1038/415530a
Pomeroy, 2002, Prediction of central nervous system embryonal tumour outcome based on gene expression, Nature, 415, 436, 10.1038/415436a
Alon, 1999, Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays, Proc Natl Acad Sci USA, 6745, 10.1073/pnas.96.12.6745
Singh, 2002, Gene expression correlates of clinical prostate cancer behavior, Cancer Cell, 1, 203, 10.1016/S1535-6108(02)00030-2
Langley P, Iba W, Thompson K. An analysis of Bayesian classifiers. In: Proceedings of the tenth national conference on artificial intelligence; 1992. p. 223–8.
Kira K, Rendell L. A practical approach to feature selection. In: Proceedings of the ninth international conference on machine learning. Morgan Kaufmann; 1992. p. 249–56.
Sima, 2005, Impact of error estimation on feature selection, Pattern Recognit, 38, 2472, 10.1016/j.patcog.2005.03.026
Yu, 2008, Feature selection for genomic data analysis, 337
Yang, 2006, A stable gene selection in microarray data analysis, BMC Bioinform, 7, 228, 10.1186/1471-2105-7-228
Davis, 2006, Reliable gene signatures for microarray classification: assessment of stability and performance, Bioinformatics, 22, 2356, 10.1093/bioinformatics/btl400
Domingos P. A unified bias-variance decomposition and its applications. In: Proceedings of the seventeenth international conference on machine learning. Morgan Kaufmann, San Francisco; 2000. p. 231–38.