Ego-zones: non-symmetric dependencies reveal network groups with large and dense overlaps

Applied Network Science - Tập 4 Số 1 - 2019
Miloš Kudělka1, Eliška Ochodková1, Šárka Zehnalová1, Jakub Plesnik1
1Dept. of Computer Science, Faculty of Electrical Engineering and Computer Science, VSB-Technical University of Ostrava, 17. listopadu 2172/15, Ostrava-Poruba, 708 00, Czech Republic

Tóm tắt

AbstractThe existence of groups of nodes with common characteristics and the relationships between these groups are important factors influencing the structures of social, technological, biological, and other networks. Uncovering such groups and the relationships between them is, therefore, necessary for understanding these structures. Groups can either be found by detection algorithms based solely on structural analysis or identified on the basis of more in-depth knowledge of the processes taking place in networks. In the first case, these are mainly algorithms detecting non-overlapping communities or communities with small overlaps. The latter case is about identifying ground-truth communities, also on the basis of characteristics other than only network structure. Recent research into ground-truth communities shows that in real-world networks, there are nested communities or communities with large and dense overlaps which we are not yet able to detect satisfactorily only on the basis of structural network properties.In our approach, we present a new perspective on the problem of group detection using only the structural properties of networks. Its main contribution is pointing out the existence of large and dense overlaps of detected groups. We use the non-symmetric structural similarity between pairs of nodes, which we refer to as dependency, to detect groups that we call zones. Unlike other approaches, we are able, thanks to non-symmetry, accurately to describe the prominent nodes in the zones which are responsible for large zone overlaps and the reasons why overlaps occur. The individual zones that are detected provide new information associated in particular with the non-symmetric relationships within the group and the roles that individual nodes play in the zone. From the perspective of global network structure, because of the non-symmetric node-to-node relationships, we explore new properties of real-world networks that describe the differences between various types of networks.

Từ khóa


Tài liệu tham khảo

Abbasi, A, Chung KSK, Hossain L (2012) Egocentric analysis of co-authorship network structure, position and performance. Inf Process Manag 48(4):671–679.

Agrawal, M, Zitnik M, Leskovec J, et al (2018) Large-scale analysis of disease pathways in the human interactome. Pac Symp Biocomput 23:111–122. World Scientific.

Ahn, Y-Y, Bagrow JP, Lehmann S (2010) Link communities reveal multiscale complexity in networks. Nature 466(7307):761.

Aiello, LM, Deplano M, Schifanella R, Ruffo G (2012) People are strange when you’re a stranger: Impact and influence of bots on social networks In: ICWSM’12: Proceedings of the 6th AAAI International Conference on Weblogs and Social Media.. AAAI.

Albert, R, Barabási A-L (2002) Statistical mechanics of complex networks. Rev Mod Phys 74(1):47.

Bagrow, JP, Bollt EM (2005) Local method for detecting communities. Phys Rev E 72(4):046108.

Barnes, ER (1982) An algorithm for partitioning the nodes of a graph. SIAM J Algebraic Discret Methods 3(4):541–550.

Bashan, A, Parshani R, Havlin S (2011) Percolation in networks composed of connectivity and dependency links. Phys Rev E 83(5):051127.

Baumes, J, Goldberg MK, Krishnamoorthy MS, Magdon-Ismail M, Preston N (2005) Finding communities by clustering a graph into overlapping subgraphs. IADIS AC 5:97–104.

Bianconi, G, Darst RK, Iacovacci J, Fortunato S (2014) Triadic closure as a basic generating mechanism of communities in complex networks. Phys Rev E 90(4):042806.

Blondel, VD, Guillaume J-L, Lambiotte R, Lefebvre E (2008) Fast unfolding of communities in large networks. J Stat Mech Theory Exp 2008(10):10008.

Bu, D, Zhao Y, Cai L, Xue H, Zhu X, Lu H, Zhang J, Sun S, Ling L, Zhang N, et al (2003) Topological structure analysis of the protein–protein interaction network in budding yeast. Nucleic Acids Res 31(9):2443–2450.

Chakraborty, T, Dalmia A, Mukherjee A, Ganguly N (2017) Metrics for community analysis: A survey. ACM Comput Surv (CSUR) 50(4):54.

Cho, E, Myers SA, Leskovec J (2011) Friendship and mobility: user movement in location-based social networks In: Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining - KDD ’11, 1082–1090.. ACM. https://doi.org/10.1145/2020408.2020579 .

Clauset, A (2005) Finding local community structure in networks. Phys Rev E 72(2):026132.

Danisch, M, Guillaume J-L, Le Grand B (2013) Towards multi-ego-centred communities: a node similarity approach. Int J Web Based Communities 9(3):299–322.

Erdös, P, Rényi A (1959) On random graphs, i. Publ Math (Debrecen) 6:290–297.

Evans, TS, Lambiotte R (2010) Line graphs of weighted networks for overlapping communities. Eur Phys J B 77(2):265–272.

Fortunato, S (2010) Community detection in graphs. Phys Rep 486(3-5):75–174.

Fortunato, S, Hric D (2016) Community detection in networks: A user guide. Phys Rep 659:1–44.

Freeman, LC (1977) A set of measures of centrality based on betweenness. Sociometry 40(1):35–41. https://doi.org/10.2307/3033543 .

Girvan, M, Newman ME (2002) Community structure in social and biological networks. Proc Natl Acad Sci 99(12):7821–7826.

Gregory, S (2010) Finding overlapping communities in networks by label propagation. New J Phys 12(10):103018.

Guimera, R, Sales-Pardo M, Amaral LAN (2004) Modularity from fluctuations in random graphs and complex networks. Phys Rev E 70(2):025101.

Hric, D, Darst RK, Fortunato S (2014) Community detection in networks: Structural communities versus ground truth. Phys Rev E 90(6):062805.

Jacob, Y, Winetraub Y, Raz G, Ben-Simon E, Okon-Singer H, Rosenberg-Katz K, Hendler T, Ben-Jacob E (2016) Dependency network analysis (d ep na) reveals context related influence of brain network nodes. Sci Rep 6:27444.

Kernighan, BW, Lin S (1970) An efficient heuristic procedure for partitioning graphs. Bell Syst Tech J 49(2):291–307.

Khorasgani, RR, Chen J, Zaïane OR (2010) Top leaders community detection approach in information networks In: 4th SNA-KDD Workshop on Social Network Mining and Analysis.. ACM.

Knuth, DE (1993) The Stanford GraphBase: A Platform for Combinatorial Algorithms In: Proceedings of the Fourth Annual ACM-SIAM Symposium on Discrete Algorithms, 41–43, Philadelphia.

Kudelka, M, Zehnalova S, Horak Z, Kromer P, Snasel V (2015) Local dependency in networks. Int J Appl Math Comput Sci 25(2):281–293.

Lancichinetti, A, Fortunato S (2009) Benchmarks for testing community detection algorithms on directed and weighted graphs with overlapping communities. Phys Rev E 80(1):016118.

Lancichinetti, A, Fortunato S, Kertész J (2009) Detecting the overlapping and hierarchical community structure in complex networks. New J Phys 11(3):033015.

Leskovec, J, Lang KJ, Dasgupta A, Mahoney MW (2009) Community structure in large networks: Natural cluster sizes and the absence of large well-defined clusters. Internet Math 6(1):29–123.

McAuley, J, Leskovec J (2014) Discovering social circles in ego networks. ACM Trans Knowl Discov Data (TKDD) 8(1):4.

Mislove, A, Marcon M, Gummadi KP, Druschel P, Bhattacharjee B (2007) Measurement and analysis of online social networks In: Proceedings of the 7th ACM SIGCOMM conference on Internet measurement, 29–42.. ACM, New York.

Newman, ME (2001) The structure of scientific collaboration networks. Proc Natl Acad Sci 98(2):404–409.

Newman, ME (2006) Finding community structure in networks using the eigenvectors of matrices. Phys Rev E 74(3):036104.

Newman, ME, Girvan M (2004) Finding and evaluating community structure in networks. Phys Rev E 69(2):026113.

Palla, G, Derényi I, Farkas I, Vicsek T (2005) Uncovering the overlapping community structure of complex networks in nature and society. Nature 435(7043):814.

Parshani, R, Buldyrev SV, Havlin S (2011) Critical effect of dependency groups on the function of networks. Proc Natl Acad Sci 108(3):1007–1010.

Pons, P, Latapy M (2005) Computing communities in large networks using random walks In: International Symposium on Computer and Information Sciences, 284–293.. Springer. https://doi.org/10.1007/11569596_31 .

Poulin, V, Théberge F (2018) Ensemble clustering for graphs In: International Conference on Complex Networks and their Applications, 231–243.. Springer.

Rosvall, M, Bergstrom CT (2008) Maps of random walks on complex networks reveal community structure. Proc Natl Acad Sci U S A 105(4):1118–1123.

Rozemberczki, B, Davies R, Sarkar R, et al. (2018) Gemsec: Graph embedding with self clustering. arXiv preprint arXiv:1802.03997.

Tatti, N, Gionis A (2013) Discovering nested communities In: Machine Learning and Knowledge Discovery in Databases, 32–47.. Springer, Berlin.

Tversky, A (1977) Features of similarity. Psychol Rev 84(4):327.

Watts, DJ, Strogatz SH (1998) Collective dynamics of ’small-world’networks. Nature 393(6684):440.

Wishart, DS, Feunang YD, Guo AC, Lo EJ, Marcu A, Grant JR, Sajed T, Johnson D, Li C, Sayeeda Z, et al (2017) Drugbank 5.0: a major update to the drugbank database for 2018. Nucleic Acids Res 46(D1):1074–1082.

Xie, J, Kelley S, Szymanski BK (2013) Overlapping community detection in networks: The state-of-the-art and comparative study. ACM Comput Surv (csur) 45(4):43.

Yang, J, Leskovec J (2012) Community-affiliation graph model for overlapping network community detection In: 2012 IEEE 12th International Conference on Data Mining, 1170–1175.. IEEE. https://doi.org/10.1109/icdm.2012.139 .

Yang, J, Leskovec J (2015) Defining and evaluating network communities based on ground-truth. Knowl Inf Syst 42(1):181–213.

Yang, Y, Klimmt B (2004) The Enron corpus: A new dataset for email classification research In: European Conference on Machine Learning, 217–226.. Springer.

Zachary, WW (1977) An information flow model for conflict and fission in small groups. J Anthropol Res 33(4):452–473.

Zhang, J, Cheng J, Su X, Yin X, Zhao S, Chen X (2018) Correlation Analysis of Nodes Identifies Real Communities in Networks. arXiv preprint arXiv:1804.06005.

Zitnik, M, Agrawal M, Leskovec J (2018) Modeling polypharmacy side effects with graph convolutional networks. Bioinformatics 34(13):i457–i466.