A novel method to identify gene interaction patterns

Springer Science and Business Media LLC - Volume 22 - Pages 1-15 - 2021
Xinguo Lu¹, Fang Liu¹, Qiumai Miao¹, Ping Liu², Yan Gao¹, Keren He¹
¹ College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
² Hunan Want Want Hospital, Changsha, China

Abstract

Gene interaction patterns, including modules and motifs, can be used to identify cancer-specific biomarkers and to reveal the mechanisms of tumorigenesis. Most existing module network inference methods focus on independent functional patterns of genes, and the overlapping characteristics between modules have received little study. The objective of this study was to reveal the functional overlapping patterns in gene modules, to help elucidate the regulatory relationships between overlapping genes and communities, and to explore cancer formation and progression. We analyzed six cancer datasets from The Cancer Genome Atlas and obtained three kinds of gene functional modules for each cancer: Independent-Community, Dependent-Community and Merged-Community. Across the six cancers, 59 (3.5%) Independent-Communities and 1631 (96.5%) Dependent-Communities were identified. Compared with Lemon-Tree and K-Means, the gene communities identified by our method were enriched in more known GO categories and with lower p-values. The distinguishing communities also separated patients into groups with significantly different survival prognoses in Kaplan-Meier analysis. Furthermore, the driver genes identified in the gene communities can be considered biomarkers that accurately distinguish tumour from normal samples for each cancer type. Dependent-Communities make up the large majority of all identified communities, and our method is more effective than the other two methods, which do not consider the overlapping characteristics of modules. This indicates that overlapping genes belong to several specific functional groups and act as a communication bridge between communities, giving a more comprehensive picture of carcinogenesis.
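The distinction between community types can be illustrated with a minimal Python sketch. It rests on an assumption that this abstract does not confirm: a community is treated as dependent when it shares at least one gene with another community, and as independent otherwise. The function names, community labels and gene labels are hypothetical placeholders, and the Merged-Community step is not modelled. The last lines only re-derive the reported percentages from the reported counts.

```python
from itertools import combinations


def classify_communities(communities):
    """Split communities by gene overlap (ASSUMED definition, not the paper's).

    communities: dict mapping a community name to a set of gene labels.
    Returns (independent, dependent) as two sets of community names.
    """
    dependent = set()
    # Compare every unordered pair of communities once.
    for (name_a, genes_a), (name_b, genes_b) in combinations(communities.items(), 2):
        if genes_a & genes_b:  # at least one shared gene
            dependent.update((name_a, name_b))
    independent = set(communities) - dependent
    return independent, dependent


def overlapping_genes(communities):
    """Return the genes that appear in more than one community."""
    seen, shared = set(), set()
    for genes in communities.values():
        shared |= seen & genes
        seen |= genes
    return shared


# Hypothetical toy data: G3 links C1 and C2, while C3 stands alone.
toy = {
    "C1": {"G1", "G2", "G3"},
    "C2": {"G3", "G4"},
    "C3": {"G5", "G6"},
}
independent, dependent = classify_communities(toy)
bridge_genes = overlapping_genes(toy)

# Consistency check on the counts reported in the abstract.
n_independent, n_dependent = 59, 1631
total = n_independent + n_dependent
independent_share = round(100 * n_independent / total, 1)
dependent_share = round(100 * n_dependent / total, 1)
```

In the toy data, `G3` plays the role of an overlapping gene connecting `C1` and `C2`. The pairwise comparison is quadratic in the number of communities, which is acceptable for a sketch of this size.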
