Geometrical inspired pre-weighting enhances Markov clustering community detection in complex networks

Applied Network Science - Tập 6 - Trang 1-16 - 2021
Claudio Durán1, Alessandro Muscoloni1, Carlo Vittorio Cannistraci1,2
1Biomedical Cybernetics Group, Biotechnology Center (BIOTEC), Center for Molecular and Cellular Bioengineering (CMCB), Center for Systems Biology Dresden (CSBD), Department of Physics, Technische Universität Dresden, Dresden, Germany
2Center for Complex Network Intelligence (CCNI) at Tsinghua Laboratory of Brain and Intelligence (THBI), Department of Biomedical Engineering, Tsinghua University, Beijing, China

Tóm tắt

Markov clustering is an effective unsupervised pattern recognition algorithm for data clustering in high-dimensional feature space. However, its community detection performance in complex networks has been demonstrating results far from the state of the art methods such as Infomap and Louvain. The crucial issue is to convert the unweighted network topology in a ‘smart-enough’ pre-weighted connectivity that adequately steers the stochastic flow procedure behind Markov clustering. Here we introduce a conceptual innovation and we discuss how to leverage network latent geometry notions in order to design similarity measures for pre-weighting the adjacency matrix used in Markov clustering community detection. Our results demonstrate that the proposed strategy improves Markov clustering significantly, to the extent that it is often close to the performance of current state of the art methods for community detection. These findings emerge considering both synthetic ‘realistic’ networks (with known ground-truth communities) and real networks (with community metadata), and even when the real network connectivity is corrupted by noise artificially induced by missing or spurious links. Our study enhances the generalized understanding of how network geometry plays a fundamental role in the design of algorithms based on network navigability.

Tài liệu tham khảo

Adamic LA, Glance N (2005) The political blogosphere and the 2004 U.S. Election: divided they blog. LinkKDD 2005, pp 36–43 Athanassopoulos S, Kaklamanis C, Laftsidis I, Papaioannou E (2010) An experimental study of greedy routing algorithms. In: Proceedings of International Conference on High Performance Computing & Simulation, pp 150–156. https://doi.org/10.1109/HPCS.2010.5547143 Blondel VD, Guillaume J-L, Lambiotte R, Lefebvre E (2008) Fast unfolding of communities in large networks. J Stat Mech Theory Exp 2008:10008. https://doi.org/10.1088/1742-5468/2008/10/P10008 Boguñá M, Krioukov D, Claffy KC (2008) Navigability of complex networks. Nat Phys 5:74–80. https://doi.org/10.1038/nphys1130 Cannistraci CV, Muscoloni A (2018) Latent geometry inspired graph dissimilarities enhance affinity propagation community detection in complex networks. 20:063022. ArXiv: 180404566 Cannistraci CV, Ravasi T, Montevecchi FM, Ideker T, Alessio M (2010) Nonlinear dimension reduction and clustering by Minimum Curvilinearity unfold neuropathic pain and tissue embryological classes. Bioinformatics 26:i531–i539 Cannistraci CV, Alanis-Lobato G, Ravasi T (2013) Minimum curvilinearity to enhance topological prediction of protein interactions by network embedding. Bioinformatics 29:199–209. https://doi.org/10.1093/bioinformatics/btt208 Clauset A, Rohilla Shalizi C, Newman MEJ (2009) Power-law distributions in empirical data. SIAM Rev 51:661–703. https://doi.org/10.1214/13-AOAS710 Cross R, Parker A (2004) The hidden power of social networks. Harvard Business School Press, Brighton Csárdi G, Nepusz T (2006) The igraph software package for complex network research. Int J Complex Syst. p 1695 Danon L, Diaz-Guilera A, Duch J, Arenas A (2005) Comparing community structure identification. J Stat Mech Theory Exp P09008:1–10 Fortunato S, Hric D (2016) Community detection in networks: a user guide. Phys Rep 659:1–44. https://doi.org/10.1016/j.physrep.2016.09.002 Girvan M, Newman MEJ (2002) Community structure in social and biological networks. PNAS 99:7821–7826. https://doi.org/10.1073/pnas.122653799 Hric D, Darst RK, Fortunato S (2014) Community detection in networks: structural communities versus ground truth. Phys Rev E Stat Nonlinear Soft Matter Phys. https://doi.org/10.1103/PhysRevE.90.062805 Jalili M, Perc M (2017) Information cascades in complex networks. J Complex Netw 5:665–693. https://doi.org/10.1093/comnet/cnx019 Jia G, Cai Z, Musolesi M, Wang Y, Tennant DA, Weber RJM, Heath JK, He S (2012) Community detection in social and biological networks using differential evolution, pp 71–85. https://doi.org/10.1007/978-3-642-34413-8_6 Krioukov D, Papadopoulos F, Kitsak M, Vahdat A, Boguñá M (2010) Hyperbolic geometry of complex networks. Phys Rev E Stat Nonlinear Soft Matter Phys 82:036106. https://doi.org/10.1103/PhysRevE.82.036106 Lancichinetti A, Fortunato S (2009) Community detection algorithms: a comparative analysis. Phys Rev E 80:056117. https://doi.org/10.1103/PhysRevE.80.056117 Muscoloni A, Cannistraci CV (2018a) Minimum curvilinear automata with similarity attachment for network embedding and link prediction in the hyperbolic space. ArXiv: 180201183 Muscoloni A, Cannistraci CV (2018b) A nonuniform popularity-similarity optimization (nPSO) model to efficiently generate realistic complex networks with communities. New J Phys 20:052002. https://doi.org/10.1088/1367-2630/aac06f Muscoloni A, Cannistraci CV (2018c) Leveraging the nonuniform PSO network model as a benchmark for performance evaluation in community detection and link prediction. New J Phys 20:063022 Muscoloni A, Cannistraci CV (2019) Navigability evaluation of complex networks by greedy routing efficiency. Proc Natl Acad Sci 116:1468–1469. https://doi.org/10.1073/pnas.1817880116 Muscoloni A, Thomas JM, Ciucci S, Bianconi G, Cannistraci CV (2017) Machine learning meets complex networks via coalescent embedding in the hyperbolic space. Nat Commun 8:1615 Orman GK, Labatut V (2009) A comparison of community detection algorithms on artificial networks. In: Discovery science, pp 242–256 Papadopoulos S, Kompatsiaris Y, Vakali A, Spyridonos P (2012a) Community detection in social media performance and application considerations. Data Min Knowl Discov 24:515–554. https://doi.org/10.1007/s10618-011-0224-z Papadopoulos F, Kitsak M, Serrano MA, Boguñá M, Krioukov D (2012b) Popularity versus similarity in growing networks. Nature 489:537–540. https://doi.org/10.1038/nature11459 Rosvall M, Bergstrom CT (2011) Multilevel compression of random walks on networks reveals hierarchical organization in large integrated systems. PLoS ONE 6:e18209. https://doi.org/10.1371/journal.pone.0018209 Satuluri V, Parthasarathy S (2009) Scalable graph clustering using stochastic flows. In: Proceedings of 15th ACM SIGKDD international conference on knowledge discovery data mining—KDD ’09, p 737. https://doi.org/10.1145/1557019.1557101 Serrano MÁ, Krioukov D, Boguñá M (2008) Self-similarity of complex networks and hidden metric spaces. Phys Rev Lett 100:1–4. https://doi.org/10.1103/PhysRevLett.100.078701 van Dongen S (2000) Graph clustering by flow simulation. Graph Stimul by flow Clust. https://doi.org/10.1016/j.cosrev.2007.05.001 Vinh NX, Epps J, Bailey J (2010) Information theoretic measures for clusterings comparison: variants, properties, normalization and correction for chance. J Mach Learn Res 11:2837–2854 Vlasblom J, Wodak SJ (2009) Markov clustering versus affinity propagation for the partitioning of protein interaction graphs. BMC Bioinform 10:99. https://doi.org/10.1186/1471-2105-10-99 Watts DJ, Strogatz SH (1998) Collective dynamics of ‘small-world’ networks. Nature 393:440–442. https://doi.org/10.1038/30918 Xie J, Szymanski BBK (2013) Labelrank: a stabilized label propagation algorithm for community detection in networks. Netw Sci Work: NSW 2013:138–143. https://doi.org/10.1109/NSW.2013.6609210 Yang Z, Algesheimer R, Tessone CJ (2016) A comparative analysis of community detection algorithms on artificial networks. Sci Rep 6:30750. https://doi.org/10.1038/srep30750 Zachary WW (1977) An information flow model for conflict and fission in small groups. J Anthropol Res 33:452–473. https://doi.org/10.2307/3629752