Geometrical inspired pre-weighting enhances Markov clustering community detection in complex networks
Tóm tắt
Markov clustering is an effective unsupervised pattern recognition algorithm for data clustering in high-dimensional feature space. However, its community detection performance in complex networks has been demonstrating results far from the state of the art methods such as Infomap and Louvain. The crucial issue is to convert the unweighted network topology in a ‘smart-enough’ pre-weighted connectivity that adequately steers the stochastic flow procedure behind Markov clustering. Here we introduce a conceptual innovation and we discuss how to leverage network latent geometry notions in order to design similarity measures for pre-weighting the adjacency matrix used in Markov clustering community detection. Our results demonstrate that the proposed strategy improves Markov clustering significantly, to the extent that it is often close to the performance of current state of the art methods for community detection. These findings emerge considering both synthetic ‘realistic’ networks (with known ground-truth communities) and real networks (with community metadata), and even when the real network connectivity is corrupted by noise artificially induced by missing or spurious links. Our study enhances the generalized understanding of how network geometry plays a fundamental role in the design of algorithms based on network navigability.
Tài liệu tham khảo
Adamic LA, Glance N (2005) The political blogosphere and the 2004 U.S. Election: divided they blog. LinkKDD 2005, pp 36–43
Athanassopoulos S, Kaklamanis C, Laftsidis I, Papaioannou E (2010) An experimental study of greedy routing algorithms. In: Proceedings of International Conference on High Performance Computing & Simulation, pp 150–156. https://doi.org/10.1109/HPCS.2010.5547143
Blondel VD, Guillaume J-L, Lambiotte R, Lefebvre E (2008) Fast unfolding of communities in large networks. J Stat Mech Theory Exp 2008:10008. https://doi.org/10.1088/1742-5468/2008/10/P10008
Boguñá M, Krioukov D, Claffy KC (2008) Navigability of complex networks. Nat Phys 5:74–80. https://doi.org/10.1038/nphys1130
Cannistraci CV, Muscoloni A (2018) Latent geometry inspired graph dissimilarities enhance affinity propagation community detection in complex networks. 20:063022. ArXiv: 180404566
Cannistraci CV, Ravasi T, Montevecchi FM, Ideker T, Alessio M (2010) Nonlinear dimension reduction and clustering by Minimum Curvilinearity unfold neuropathic pain and tissue embryological classes. Bioinformatics 26:i531–i539
Cannistraci CV, Alanis-Lobato G, Ravasi T (2013) Minimum curvilinearity to enhance topological prediction of protein interactions by network embedding. Bioinformatics 29:199–209. https://doi.org/10.1093/bioinformatics/btt208
Clauset A, Rohilla Shalizi C, Newman MEJ (2009) Power-law distributions in empirical data. SIAM Rev 51:661–703. https://doi.org/10.1214/13-AOAS710
Cross R, Parker A (2004) The hidden power of social networks. Harvard Business School Press, Brighton
Csárdi G, Nepusz T (2006) The igraph software package for complex network research. Int J Complex Syst. p 1695
Danon L, Diaz-Guilera A, Duch J, Arenas A (2005) Comparing community structure identification. J Stat Mech Theory Exp P09008:1–10
Fortunato S, Hric D (2016) Community detection in networks: a user guide. Phys Rep 659:1–44. https://doi.org/10.1016/j.physrep.2016.09.002
Girvan M, Newman MEJ (2002) Community structure in social and biological networks. PNAS 99:7821–7826. https://doi.org/10.1073/pnas.122653799
Hric D, Darst RK, Fortunato S (2014) Community detection in networks: structural communities versus ground truth. Phys Rev E Stat Nonlinear Soft Matter Phys. https://doi.org/10.1103/PhysRevE.90.062805
Jalili M, Perc M (2017) Information cascades in complex networks. J Complex Netw 5:665–693. https://doi.org/10.1093/comnet/cnx019
Jia G, Cai Z, Musolesi M, Wang Y, Tennant DA, Weber RJM, Heath JK, He S (2012) Community detection in social and biological networks using differential evolution, pp 71–85. https://doi.org/10.1007/978-3-642-34413-8_6
Krioukov D, Papadopoulos F, Kitsak M, Vahdat A, Boguñá M (2010) Hyperbolic geometry of complex networks. Phys Rev E Stat Nonlinear Soft Matter Phys 82:036106. https://doi.org/10.1103/PhysRevE.82.036106
Lancichinetti A, Fortunato S (2009) Community detection algorithms: a comparative analysis. Phys Rev E 80:056117. https://doi.org/10.1103/PhysRevE.80.056117
Muscoloni A, Cannistraci CV (2018a) Minimum curvilinear automata with similarity attachment for network embedding and link prediction in the hyperbolic space. ArXiv: 180201183
Muscoloni A, Cannistraci CV (2018b) A nonuniform popularity-similarity optimization (nPSO) model to efficiently generate realistic complex networks with communities. New J Phys 20:052002. https://doi.org/10.1088/1367-2630/aac06f
Muscoloni A, Cannistraci CV (2018c) Leveraging the nonuniform PSO network model as a benchmark for performance evaluation in community detection and link prediction. New J Phys 20:063022
Muscoloni A, Cannistraci CV (2019) Navigability evaluation of complex networks by greedy routing efficiency. Proc Natl Acad Sci 116:1468–1469. https://doi.org/10.1073/pnas.1817880116
Muscoloni A, Thomas JM, Ciucci S, Bianconi G, Cannistraci CV (2017) Machine learning meets complex networks via coalescent embedding in the hyperbolic space. Nat Commun 8:1615
Orman GK, Labatut V (2009) A comparison of community detection algorithms on artificial networks. In: Discovery science, pp 242–256
Papadopoulos S, Kompatsiaris Y, Vakali A, Spyridonos P (2012a) Community detection in social media performance and application considerations. Data Min Knowl Discov 24:515–554. https://doi.org/10.1007/s10618-011-0224-z
Papadopoulos F, Kitsak M, Serrano MA, Boguñá M, Krioukov D (2012b) Popularity versus similarity in growing networks. Nature 489:537–540. https://doi.org/10.1038/nature11459
Rosvall M, Bergstrom CT (2011) Multilevel compression of random walks on networks reveals hierarchical organization in large integrated systems. PLoS ONE 6:e18209. https://doi.org/10.1371/journal.pone.0018209
Satuluri V, Parthasarathy S (2009) Scalable graph clustering using stochastic flows. In: Proceedings of 15th ACM SIGKDD international conference on knowledge discovery data mining—KDD ’09, p 737. https://doi.org/10.1145/1557019.1557101
Serrano MÁ, Krioukov D, Boguñá M (2008) Self-similarity of complex networks and hidden metric spaces. Phys Rev Lett 100:1–4. https://doi.org/10.1103/PhysRevLett.100.078701
van Dongen S (2000) Graph clustering by flow simulation. Graph Stimul by flow Clust. https://doi.org/10.1016/j.cosrev.2007.05.001
Vinh NX, Epps J, Bailey J (2010) Information theoretic measures for clusterings comparison: variants, properties, normalization and correction for chance. J Mach Learn Res 11:2837–2854
Vlasblom J, Wodak SJ (2009) Markov clustering versus affinity propagation for the partitioning of protein interaction graphs. BMC Bioinform 10:99. https://doi.org/10.1186/1471-2105-10-99
Watts DJ, Strogatz SH (1998) Collective dynamics of ‘small-world’ networks. Nature 393:440–442. https://doi.org/10.1038/30918
Xie J, Szymanski BBK (2013) Labelrank: a stabilized label propagation algorithm for community detection in networks. Netw Sci Work: NSW 2013:138–143. https://doi.org/10.1109/NSW.2013.6609210
Yang Z, Algesheimer R, Tessone CJ (2016) A comparative analysis of community detection algorithms on artificial networks. Sci Rep 6:30750. https://doi.org/10.1038/srep30750
Zachary WW (1977) An information flow model for conflict and fission in small groups. J Anthropol Res 33:452–473. https://doi.org/10.2307/3629752