iOPTICS-GSO for identifying protein complexes from dynamic PPI networks

BMC Medical Genomics - Volume 10, Issue 5 - Pages 55-66 - 2017
Lei, Xiujuan (1); Li, Huan (1); Zhang, Aidong (2); Wu, Fang-Xiang (3, 4)
(1) School of Computer Science, Shaanxi Normal University, Xi’an, China
(2) Department of Computer Science and Engineering, State University of New York at Buffalo, NY, USA
(3) School of Mathematical Sciences, Nankai University, Tianjin, China
(4) Division of Biomedical Engineering and Department of Mechanical Engineering, University of Saskatchewan, Saskatoon, Canada

Abstract

Identifying protein complexes plays an important role in understanding cellular organization and functional mechanisms. Because ample evidence indicates that dense sub-networks in a dynamic protein-protein interaction network (DPIN) usually correspond to protein complexes, identifying protein complexes is formulated as a density-based clustering problem. In this paper, a new approach named iOPTICS-GSO is developed: an improved Ordering Points to Identify the Clustering Structure (OPTICS) algorithm in which the glowworm swarm optimization (GSO) algorithm optimizes the OPTICS parameters used to find dense sub-networks. In iOPTICS-GSO, the concept of a core node is redefined, the Euclidean distance in OPTICS is replaced with an improved similarity between nodes in the PPI network based on their interaction strength, and the resulting dense sub-networks are taken as protein complexes. Experimental results show that iOPTICS-GSO outperforms algorithms such as DBSCAN, CFinder, MCODE, CMC, COACH, ClusterOne, MCL and OPTICS_PSO in terms of f-measure and p-value on four DPINs built from the DIP, Krogan, MIPS and Gavin datasets. In addition, the predicted protein complexes have small p-values and are therefore highly likely to be true protein complexes. The proposed iOPTICS-GSO obtains its clustering results by using GSO to optimize the parameters in OPTICS, and the results on the four datasets show superior performance. Moreover, the results provide clues for biologists to verify and find new protein complexes.
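
The core idea, density-based clustering on a PPI graph with a node similarity in place of Euclidean distance, can be illustrated with a small Python sketch. This is not the authors' iOPTICS-GSO: it is a simplified DBSCAN-style expansion rather than a full OPTICS ordering, the Jaccard similarity of closed neighbourhoods is an assumed stand-in for the paper's interaction-strength similarity, and the core-node rule shown here is only one plausible reading. The names `jaccard`, `dense_subnetworks`, `eps` and `min_pts` are hypothetical, and the two parameters are fixed by hand instead of being tuned by GSO.

```python
from collections import deque


def jaccard(adj, u, v):
    """Jaccard similarity of closed neighbourhoods (an assumed stand-in)."""
    nu = adj[u] | {u}
    nv = adj[v] | {v}
    return len(nu & nv) / len(nu | nv)


def dense_subnetworks(edges, eps, min_pts):
    """DBSCAN-style clustering on a graph.

    A node counts as a core node when at least `min_pts` of its
    interaction partners have similarity >= `eps` to it.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    def strong(u):
        # Interaction partners that are similar enough to u.
        return {v for v in adj[u] if jaccard(adj, u, v) >= eps}

    labels = {}
    clusters = []
    for seed in sorted(adj):
        if seed in labels or len(strong(seed)) < min_pts:
            continue  # already assigned, or not a core node
        cid = len(clusters)
        labels[seed] = cid
        members = {seed}
        queue = deque([seed])
        while queue:
            u = queue.popleft()
            nbrs = strong(u)
            if len(nbrs) < min_pts:
                continue  # border node: joins the cluster but does not expand it
            for v in nbrs:
                if v not in labels:
                    labels[v] = cid
                    members.add(v)
                    queue.append(v)
        clusters.append(members)
    return clusters


# Toy network: two 4-cliques joined by one bridge edge, plus a pendant node "i".
edges = [
    ("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"), ("c", "d"),
    ("e", "f"), ("e", "g"), ("e", "h"), ("f", "g"), ("f", "h"), ("g", "h"),
    ("d", "e"), ("h", "i"),
]
clusters = dense_subnetworks(edges, eps=0.5, min_pts=2)
print([sorted(c) for c in clusters])
```

On this toy graph the sketch recovers the two cliques as separate dense sub-networks and leaves the pendant node unassigned, because the bridge and pendant edges fall below the similarity threshold. In the approach the abstract describes, a swarm optimizer would search over parameters like `eps` and `min_pts` instead of fixing them by hand.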
