A Stack-Based Ensemble Framework for Detecting Cancer MicroRNA Biomarkers

Genomics, Proteomics & Bioinformatics - Volume 15, Issue 6 - Pages 381-388 - 2017
Sriparna Saha¹, Sayantan Mitra¹, Ravi Kant Yadav¹
¹Department of Computer Science and Engineering, Indian Institute of Technology, Patna 801103, India

Abstract

MicroRNAs (miRNAs) play vital roles in biological processes such as RNA splicing and the regulation of gene expression. Studies suggest possible links between oncogenesis and the expression profiles of some miRNAs, because these miRNAs are differentially expressed between normal and tumor tissues. However, the automatic classification of miRNAs into different categories based on the similarity of their expression values has rarely been addressed. This article proposes a framework for solving real-life classification problems related to cancer, miRNA, and mRNA expression datasets. In the first stage, a multiobjective optimization framework based on the non-dominated sorting genetic algorithm II (NSGA-II) automatically determines the appropriate classifier type, along with suitable parameter and feature combinations, for classifying a given dataset. In the second stage, a stack-based ensemble technique combines the set of solutions obtained in the first stage into a single solution. The proposed two-stage approach is evaluated on several cancer and RNA expression profile datasets. Compared with several state-of-the-art approaches on these datasets, the method achieves higher classification accuracy.
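The two stages described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the objectives, so the sketch assumes two (maximize accuracy, minimize the number of selected features), and the `Candidate` class, the classifier labels, the accuracy values, and the table-based meta-learner are all hypothetical placeholders. The first part keeps only the non-dominated candidates, which is the selection idea at the core of NSGA-II; the second part shows one very simple form of stacking, where a meta-learner is fitted on the predictions of the retained base classifiers.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class Candidate:
    """Hypothetical first-stage solution: a classifier choice plus a feature mask."""
    classifier: str                # illustrative label, e.g. "svm" or "knn"
    feature_mask: Tuple[int, ...]  # 1 = feature kept, 0 = feature dropped
    accuracy: float                # assumed objective 1 (maximize)

    @property
    def n_features(self) -> int:
        # Assumed objective 2 (minimize).
        return sum(self.feature_mask)


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a.accuracy >= b.accuracy and a.n_features <= b.n_features
    better = a.accuracy > b.accuracy or a.n_features < b.n_features
    return no_worse and better


def pareto_front(pop: List[Candidate]) -> List[Candidate]:
    """Return the candidates that no other candidate dominates."""
    return [c for c in pop if not any(dominates(o, c) for o in pop)]


def fit_meta(base_preds: Sequence[Sequence[str]], labels: Sequence[str]) -> Dict[tuple, str]:
    """Toy stacking meta-learner: for each tuple of base predictions,
    remember the most frequent true label seen on held-out samples."""
    table = defaultdict(Counter)
    for preds, y in zip(base_preds, labels):
        table[tuple(preds)][y] += 1
    return {key: counts.most_common(1)[0][0] for key, counts in table.items()}


def predict_meta(meta: Dict[tuple, str], preds: Sequence[str], default: str) -> str:
    """Combine base predictions into one label; fall back to default if unseen."""
    return meta.get(tuple(preds), default)


# Stage 1 (placeholder values, not results from the paper).
population = [
    Candidate("svm", (1, 1, 0, 1), 0.90),
    Candidate("knn", (1, 0, 0, 0), 0.80),
    Candidate("svm", (1, 1, 1, 1), 0.88),   # dominated by the first candidate
    Candidate("tree", (0, 1, 0, 0), 0.75),  # dominated by the knn candidate
]
front = pareto_front(population)

# Stage 2: held-out predictions of the two retained classifiers (placeholders).
held_out_preds = [("tumor", "tumor"), ("tumor", "normal"),
                  ("normal", "normal"), ("tumor", "normal")]
held_out_labels = ["tumor", "normal", "normal", "normal"]
meta = fit_meta(held_out_preds, held_out_labels)
combined = predict_meta(meta, ("tumor", "normal"), default="normal")
```

A full NSGA-II run would also evolve the population through crossover, mutation, and crowding-distance selection over many generations; the sketch shows only the non-dominated filtering step and a deliberately simple meta-learner.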
