Exact p-values for global network alignments via combinatorial analysis of shared GO terms

Wayne B. Hayes1
1Department of Computer Science, UC Irvine, Irvine, USA

Tóm tắt

Abstract

Network alignment aims to uncover topologically similar regions in the protein–protein interaction (PPI) networks of two or more species under the assumption that topologically similar regions tend to perform similar functions. Although there exist a plethora of both network alignment algorithms and measures of topological similarity, currently no “gold standard” exists for evaluating how well either is able to uncover functionally similar regions. Here we propose a formal, mathematically and statistically rigorous method for evaluating the statistical significance of shared GO terms in a global, 1-to-1 alignment between two PPI networks. Given an alignment in which k aligned protein pairs share a particular GO term g, we use a combinatorial argument to precisely quantify the p-value of that alignment with respect to g compared to a random alignment. The p-value of the alignment with respect to all GO terms, including their inter-relationships, is approximated using the Empirical Brown’s Method. We note that, just as with BLAST’s p-values, this method is not designed to guide an alignment algorithm towards a solution; instead, just as with BLAST, an alignment is guided by a scoring matrix or function; the p-values herein are computed after the fact, providing independent feedback to the user on the biological quality of the alignment that was generated by optimizing the scoring function. Importantly, we demonstrate that among all GO-based measures of network alignments, ours is the only one that correlates with the precision of GO annotation predictions, paving the way for network alignment-based protein function prediction.

Từ khóa


Tài liệu tham khảo

Aladağ AE, Erten C (2013) SPINAL: scalable protein interaction network alignment. Bioinformatics 29(7):917. https://doi.org/10.1093/bioinformatics/btt071

Alkan F, Erten C (2014) BEAMS: backbone extraction and merge strategy for the global many-to-many alignment of multiple PPI networks. Bioinformatics 30(4):531

Alkan F, Erten C (2015) SiPAN: simultaneous prediction and alignment of protein-protein interaction networks. Bioinformatics 31(14):2356

Altschul SF, Gish W, Miller W, Lipman DJ (1990) Basic local alignment search tool. J Mol Biol 215:403

Balomenos AD, Tsakanikas P, Manolakos ES (2015) Tracking single-cells in overcrowded bacterial colonies. In: 2015 37th annual international conference of the IEEE engineering in medicine and biology society (EMBC), pp 6473–6476. https://doi.org/10.1109/EMBC.2015.7319875

Chindelevitch L, Ma CY, Liao CS, Berger B (2013) Optimizing a global alignment of protein interaction networks. Bioinformatics 29(21):2765. https://doi.org/10.1093/bioinformatics/btt486

Clark C, Kalita J (2014) A comparison of algorithms for the pairwise alignment of biological networks. Bioinformatics 30(16):2351

Clark C, Kalita J (2015) A multiobjective memetic algorithm for PPI network alignment. Bioinformatics 31(12):1988. https://doi.org/10.1093/bioinformatics/btv063

Crawford J, Sun Y, Milenković T (2015) Fair evaluation of global network aligners. Algorithms Mol Biol 10(1):1

Djeddi WE, Yahia SB, Nguifo EM (2018) A novel computational approach for global alignment for multiple biological networks. IEEE/ACM Trans Comput Biol Bioinform 15(6):2060

Elmsallati A, Msalati A, Kalita J (2018) Index-based network aligner of protein-protein interaction networks. IEEE/ACM Trans Comput Biol Bioinform TCBB 15(1):330

Faisal FE, Meng L, Crawford J, Milenković T (2015) The post-genomic era of biological network alignment. EURASIP J Bioinf Syst Biol 2015(1):3

Fan J, Cannistra A, Fried I, Lim T, Schaffner T, Crovella M, Hescott B, Leiserson MD (2019) Functional protein representations from biological networks enable diverse cross-species inference. Nucleic Acids Res 47(9):e51

Flannick J, Novak A, Srinivasan BS, McAdams HH, Batzoglou S (2006) Graemlin: general and robust alignment of multiple large interaction networks. Genome Res 16(9):1169. https://doi.org/10.1101/gr.5235706

Gligorijević V, Malod-Dognin N, Pržulj N (2015) FUSE: multiple network alignment via data fusion. Bioinformatics btv731

Gong M, Peng Z, Ma L, Huang J (2015) Global biological network alignment by using efficient memetic algorithm. IEEE/ACM Trans Comput Biol Bioinf 13(6):1117

Guzzi PH, Milenković T (2017) Survey of local and global biological network alignment: the need to reconcile the two sides of the same coin. Brief Bioinform bbw132

Guzzi PH, Mina M, Guerra C, Cannataro M (2012) Semantic similarity analysis of protein data: assessment with biological features and issues. Brief Bioinform 13(5):569

Harispe S, Ranwez S, Janaqi S, Montmain J (2015) Semantic similarity from natural language and ontology analysis. Synth Lect Hum Lang Technol 8(1):1

Hashemifar S, Xu J (2014) HubAlign: an accurate and efficient method for global alignment of protein-protein interaction networks. Bioinformatics 30(17):i438. https://doi.org/10.1093/bioinformatics/btu450

Hashemifar S, Ma J, Naveed H, Canzar S, Xu J (2016) ModuleAlign: module-based global alignment of protein-protein interaction networks. Bioinformatics 32(17):i658

Hashemifar S, Huang Q, Xu J (2016) Joint alignment of multiple protein-protein interaction networks via convex optimization. J Comput Biol 23(11):903

Hu J, Kehr B, Reinert K (2014) NetCoffee: a fast and accurate global alignment approach to identify functionally conserved proteins in multiple networks. Bioinformatics 30(4):540. https://doi.org/10.1093/bioinformatics/btt715

Kalecky K, Cho YR (2018) PrimAlign: PageRank-inspired Markovian alignment for large biological networks. Bioinformatics 34(13):i537

Kazemi E, Hassani H, Grossglauser M, Modarres HP (2016) PROPER: global protein interaction network alignment through percolation matching. BMC Bioinform 17(1):527

Kuchaiev O, Pržulj N (2011) Integrative network alignment reveals large regions of global network similarity in yeast and human. Bioinformatics 27:1390. https://doi.org/10.1093/bioinformatics/btr127

Kuchaiev O, Milenković T, Memišević V, Hayes W, Pržulj N (2010) Topological network alignment uncovers biological function and phylogeny. J R Soc Interface 7(50):1341. https://doi.org/10.1098/rsif.2010.0063

Liao CS, Lu K, Baym M, Singh R, Berger B (2009) IsoRankN: spectral methods for global alignment of multiple protein networks. Bioinformatics 25(12):i253–i258

Malod-Dognin N, Pržulj N (2015) L-GRAAL: Lagrangian graphlet-based network aligner. Bioinformatics. https://doi.org/10.1093/bioinformatics/btv130

Malod-Dognin N, Ban K, Pržulj N (2017) Unified alignment of protein-protein interaction networks. Sci Rep 7(1):953

Mamano N, Hayes WB (2017) SANA: simulated annealing far outperforms many other search algorithms for biological network alignment. Bioinformatics 33:2156

Milano M, Guzzi PH, Cannataro M (2018) Glalign: A novel algorithm for local network alignment. IEEE/ACM Trans Comput Biol Bioinf 16(6):1958

Milenković T, Ng WL, Hayes W, Pržulj N (2010) Optimal network alignment with graphlet degree vectors. Cancer Inform 9:121. https://doi.org/10.4137/CIN.S4744

Mir A, Naghibzadeh M, Saadati N (2017) INDEX: incremental depth extension approach for protein-protein interaction networks alignment. Biosystems 162:24

Mistry M, Pavlidis P (2008) Gene Ontology term overlap as a measure of gene functional similarity. BMC Bioinform 9(1):327

Neyshabur B, Khadem A, Hashemifar S, Arab SS (2013) NETAL: a new graph-based method for global alignment of protein-protein interaction networks. Bioinformatics 29(13):1654. https://doi.org/10.1093/bioinformatics/btt202

Patro R, Kingsford C (2012) Global network alignment using multiscale spectral signatures. Bioinformatics 28(23):3105. https://doi.org/10.1093/bioinformatics/bts592

Pesquita C, Faria D, Bastos H, Ferreira AE, Falcão AO, Couto FM (2008) Metrics for GO based protein semantic similarity: a systematic evaluation. BMC Bioinform 9(5):S4

Pesquita C, Faria D, Falcao AO, Lord P, Couto FM (2009) Semantic similarity in biomedical ontologies. PLoS Comput Biol 5(7):e1000443

Poole W, Gibbs DL, Shmulevich I, Bernard B, Knijnenburg TA (2016) Combining dependent P-values with an empirical adaptation of Brown’s method. Bioinformatics 32(17):i430

Resnik P (1995) Using information content to evaluate semantic similarity in a taxonomy. In: Proceedings of the 14th international joint conference on artificial intelligence—volume 1, IJCAI’95. Morgan Kaufmann Publishers Inc., San Francisco, pp 448–453. http://dl.acm.org/citation.cfm?id=1625855.1625914

Resnik P et al (1999) Semantic similarity in a taxonomy: an information-based measure and its application to problems of ambiguity in natural language. J Artif Intell Res JAIR 11:95

Sarajlić A, Malod-Dognin N, Yaveroğlu ÖN, Pržulj N (2016) Graphlet-based characterization of directed networks. Sci Rep 6:35098

Saraph V, Milenković T (2014) MAGNA: maximizing accuracy in global network alignment. Bioinformatics 30(20):2931

Schlicker A, Domingues FS, Rahnenführer J, Lengauer T (2006) A new measure for functional similarity of gene products based on Gene Ontology. BMC Bioinform 7(1):302

Singh R, Xu J, Berger B (2008) Global alignment of multiple protein interaction networks with application to functional orthology detection. Proc Natl Acad Sci 105(35):12763. https://doi.org/10.1073/pnas.0806627105

Sun Y, Crawford J, Tang J, Milenkovic T (2015) Simultaneous optimization of both node and edge conservation in network alignment via WAVE. In: Pop M, Touzet H (eds) Algorithms in bioinformatics. Lecture notes in computer science, vol 9289. Springer, Berlin, pp 16–39. https://doi.org/10.1007/978-3-662-48221-6_2

The Gene Ontology Consortium (2008) Nucleic Acids Res 36(suppl 1):D440

Vijayan V, Milenković T (2018) Multiple network alignment via multiMAGNA++. IEEE/ACM Trans Comput Biol Bioinform 1:25. https://doi.org/10.1109/TCBB.2017.2740381

Wang S, Atkinson GR, Hayes WB (2022) SANA: cross-species prediction of Gene Ontology GO annotations via topological network alignment. Nat Partner J Syst Biol Appl 8(1):25

Wang S, Chen X, Frederisy BJ, Mbakogu BA, Kanne AD, Khosravi P, Hayes WB (2022) On the current failure—but bright future—of topology-driven biological network alignment. Protein Interact Netw 21(1)

Xie J, Xiang C, Ma J, Tan J, Wen T, Lei J, Nie Q (2016) An adaptive hybrid algorithm for global network alignment. IEEE/ACM Trans Comput Biol Bioinform TCBB 13(3):483

Zhu Y, Li Y, Liu J, Qin L, Yu JX (2017) GMAlign: a new network aligner for revealing large conserved functional components. In: 2017 IEEE international conference on bioinformatics and biomedicine (BIBM) (IEEE), pp 120–127