An efficient approach based on multi-sources information to predict circRNA–disease associations using deep convolutional neural network

Bioinformatics (Oxford, England) - Volume 36, Issue 13 - Pages 4038-4046 - 2020
Lei Wang1, Zhu‐Hong You1, Yu‐An Huang2, De-Shuang Huang3, Keith C. C. Chan2
1Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
2Department of Computing, Hong Kong Polytechnic University, Hong Kong, 999077, China
3Institute of Machine Learning and Systems Biology, School of Electronics and Information Engineering, Tongji University, Shanghai 201804, China

Abstract

Motivation: Emerging evidence indicates that circular RNA (circRNA) plays a crucial role in human disease. Using circRNA as a biomarker offers a new perspective on diagnosing diseases and understanding disease pathogenesis. However, detecting circRNA–disease associations by biological experiments alone is often blind, small in scale, costly and time-consuming. Therefore, there is an urgent need for reliable computational methods that rapidly infer potential circRNA–disease associations on a large scale and provide the most promising candidates for biological experiments.

Results: In this article, we propose an efficient computational method that combines multi-source information with a deep convolutional neural network (CNN) to predict circRNA–disease associations. The method first fuses multi-source information, including disease semantic similarity, disease Gaussian interaction profile kernel similarity and circRNA Gaussian interaction profile kernel similarity. It then extracts hidden deep features through the CNN and finally sends them to an extreme learning machine classifier for prediction. The 5-fold cross-validation results show that the proposed method achieves 87.21% prediction accuracy with 88.50% sensitivity and an area under the curve of 86.67% on the CircR2Disease dataset. In comparison with the state-of-the-art SVM classifier and other feature extraction methods on the same dataset, the proposed model achieves the best results. In addition, we obtained experimental support for the predictions by searching the published literature: 7 of the top 15 circRNA–disease pairs with the highest scores were confirmed. These results demonstrate that the proposed model is a suitable method for predicting circRNA–disease associations and can provide reliable candidates for biological experiments.
Availability and implementation: The source code and datasets explored in this work are available at https://github.com/look0012/circRNA-Disease-association.

Supplementary information: Supplementary data are available at Bioinformatics online.
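
The Gaussian interaction profile (GIP) kernel similarity named in the Results can be illustrated with a short sketch. The abstract does not give the formula, so the code below assumes the standard definition from van Laarhoven et al. (2011): the similarity of two entities is exp(-γ‖IP(i) - IP(j)‖²), where IP(i) is the binary association profile of entity i and γ is a bandwidth γ' divided by the mean squared norm of all profiles. The function name `gip_kernel`, the default γ' = 1 and the toy matrix are illustrative assumptions, not taken from the authors' repository.

```python
import numpy as np


def gip_kernel(interactions: np.ndarray, gamma_prime: float = 1.0) -> np.ndarray:
    """Gaussian interaction profile kernel between the rows of a binary matrix.

    Hypothetical helper: each row is the association profile of one entity
    (for example one circRNA across all diseases).
    """
    profiles = np.asarray(interactions, dtype=float)
    sq_norms = np.sum(profiles ** 2, axis=1)

    # Normalise the bandwidth by the mean squared profile norm (assumed convention).
    mean_sq_norm = sq_norms.mean()
    if mean_sq_norm == 0:
        raise ValueError("interaction matrix contains no known associations")
    gamma = gamma_prime / mean_sq_norm

    # Pairwise squared Euclidean distances via ||a||^2 + ||b||^2 - 2 a.b
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (profiles @ profiles.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative round-off

    return np.exp(-gamma * sq_dists)


# Toy association matrix (made up): rows are circRNAs, columns are diseases.
A = np.array([
    [1, 0, 1],
    [1, 0, 1],
    [0, 1, 0],
])

circ_sim = gip_kernel(A)    # circRNA-circRNA similarity
dis_sim = gip_kernel(A.T)   # disease-disease similarity
```

In this toy case the first two circRNAs share an identical profile, so their similarity is 1, while the third shares no disease with them and receives a lower score. How such matrices are fused with the semantic similarity and fed to the CNN is specific to the paper and is not reproduced here.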
