iOPTICS-GSO for identifying protein complexes from dynamic PPI networks
Abstract
Identifying protein complexes plays an important role in understanding cellular organization and functional mechanisms. Because considerable evidence indicates that dense sub-networks in a dynamic protein-protein interaction network (DPIN) usually correspond to protein complexes, identifying protein complexes can be formulated as a density-based clustering problem. In this paper, we develop a new approach named iOPTICS-GSO, which improves the Ordering Points To Identify the Clustering Structure (OPTICS) algorithm and uses the glowworm swarm optimization (GSO) algorithm to optimize the OPTICS parameters when finding dense sub-networks. In iOPTICS-GSO, the concept of a core node is redefined, and the Euclidean distance used in OPTICS is replaced with an improved similarity between nodes in the PPI network based on their interaction strength; the resulting dense sub-networks are treated as protein complexes. Experimental results show that iOPTICS-GSO outperforms algorithms such as DBSCAN, CFinder, MCODE, CMC, COACH, ClusterOne, MCL and OPTICS_PSO in terms of f-measure and p-value on four DPINs drawn from the DIP, Krogan, MIPS and Gavin datasets. In addition, the predicted protein complexes have small p-values and are thus highly likely to be true protein complexes. By adopting GSO to optimize the parameters in OPTICS, the proposed iOPTICS-GSO obtains optimal clustering results, and the results on the four datasets show superior performance. Moreover, the results provide clues for biologists to verify and find new protein complexes.
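To make the idea of density-based clustering with a node similarity in place of Euclidean distance more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a closed-neighbourhood Jaccard similarity as a stand-in for the paper's improved similarity, uses a simplified DBSCAN-style core-node expansion rather than the full OPTICS ordering, and fixes the two parameters `eps` and `min_pts` by hand where iOPTICS-GSO would tune them with GSO. All function and parameter names here are hypothetical.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from a list of (u, v) interactions."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def jaccard_similarity(adj, u, v):
    """Closed-neighbourhood Jaccard similarity (an assumed stand-in
    for the paper's interaction-strength similarity)."""
    nu = adj[u] | {u}
    nv = adj[v] | {v}
    return len(nu & nv) / len(nu | nv)


def dense_subnetworks(edges, eps=0.5, min_pts=3):
    """Group nodes into dense sub-networks by expanding from core nodes.

    A node is treated as a core node when it, together with its
    interaction partners of similarity >= eps, numbers at least min_pts.
    This is a DBSCAN-style simplification, not the OPTICS ordering.
    """
    adj = build_adjacency(edges)

    def similar_neighbors(u):
        return {v for v in adj[u] if jaccard_similarity(adj, u, v) >= eps}

    visited = set()
    clusters = []
    for node in sorted(adj):
        if node in visited:
            continue
        seeds = similar_neighbors(node)
        if len(seeds) + 1 < min_pts:
            continue  # not a core node; it may still be absorbed later
        visited.add(node)
        cluster = {node}
        queue = sorted(seeds)
        while queue:
            cur = queue.pop()
            if cur in visited:
                continue
            visited.add(cur)
            cluster.add(cur)
            nbrs = similar_neighbors(cur)
            if len(nbrs) + 1 >= min_pts:  # only core nodes keep expanding
                queue.extend(n for n in nbrs if n not in visited)
        clusters.append(cluster)
    return clusters


if __name__ == "__main__":
    # Toy network: two 4-cliques joined by a single bridge edge D-E.
    toy_edges = [
        ("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("B", "D"), ("C", "D"),
        ("E", "F"), ("E", "G"), ("E", "H"), ("F", "G"), ("F", "H"), ("G", "H"),
        ("D", "E"),
    ]
    for complex_candidate in dense_subnetworks(toy_edges):
        print(sorted(complex_candidate))
```

On this toy network the bridge edge D-E has low similarity because D and E share few neighbours, so the two cliques are reported as separate candidate complexes. In a parameter-tuned setting, `eps` and `min_pts` would be the quantities a search procedure such as GSO adjusts against a clustering quality score.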