Discovering top-weighted k-truss communities in large graphs

Journal of Big Data - Tập 9 - Trang 1-25 - 2022
Wafaa M. A. Habib1, Hoda M. O. Mokhtar1,2, Mohamed E. El-Sharkawi1,2
1Information Systems Department, Faculty of Computers and Artificial Intelligence, Cairo University, Cairo, Egypt
2Faculty of Computing and Information Sciences, Egypt University of Informatics, Cairo, Egypt

Tóm tắt

Community Search is the problem of querying networks in order to discover dense subgraphs-communities-that satisfy given query parameters. Most community search models consider link structure and ignore link weight while answering the required queries. Given the importance of link weight in different networks, this paper considers both link structure and link weight to discover top-r weighted k-truss communities via community search. The top-weighted k-truss communities are those communities with the highest weight and the highest cohesiveness within the network. All recent studies that considered link weight discover top-weighted communities via global search and index-based search techniques. In this paper three different algorithms are proposed to scale-up the existing approaches of weighted community search via local search. The performance evaluation shows that the proposed algorithms significantly outperform the existing state-of-the-art algorithms over different datasets in terms of search time by several orders of magnitude.

Tài liệu tham khảo

Cohen J. Trusses: Cohesive subgraphs for social network analysis. Natl Secur Agency Tech Rep. 2008;16:3–1. Wang J, Cheng J. Truss decomposition in massive networks. Proc VLDB Endow. 2012;5(9):812–23. Li RH, Qin L, Yu JX, Mao R. Influential community search in large networks. Proc VLDB Endow. 2015;8(5):509–20. Cui W, Xiao Y, Wang H, Wang W. Local Search of Communities in Large Graphs. In: Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data. SIGMOD ’14. New York, NY, USA: ACM; 2014. p. 991–1002. http://doi.acm.org/10.1145/2588555.2612179. Huang X, Cheng H, Qin L, Tian W, Yu JX. Querying k-truss community in large and dynamic graphs. In: Proceedings of the 2014 ACM SIGMOD international conference on Management of data. ACM; 2014. p. 1311–1322. Akbas E, Zhao P. Truss-based community search: a truss-equivalence based indexing approach. Proc VLDB Endow. 2017;10(11):1298–309. Huang X, Lakshmanan LV. Attribute-driven community search. Proc VLDB Endow. 2017;10(9):949–60. Chen S, Wei R, Popova D, Thomo A. Efficient computation of importance based communities in web-scale networks using a single machine. In: Proceedings of the 25th ACM International on Conference on Information and Knowledge Management. ACM; 2016. p. 1553–1562. Fortunato S. Community detection in graphs. Phys Rep. 2010;486(3–5):75–174. Zheng Z, Ye F, Li RH, Ling G, Jin T. Finding weighted k-truss communities in large networks. Inf Sci. 2017;417:344–60. Garas A, Argyrakis P, Rozenblat C, Tomassini M, Havlin S. Worldwide spreading of economic crisis. New J Phys. 2010;12(11): 113043. Bi F, Chang L, Lin X, Zhang W. An optimal and progressive approach to online search of top-k influential communities. Proc VLDB Endow. 2018;11(9):1056–68. Chang L, Li W, Qin L, Zhang W, Yang S. Fast and exact structural graph clustering. IEEE Trans Knowl Data Eng. 2017;29(2):387–401. Shao J, Han Z, Yang Q, Zhou T. Community Detection Based on Distance Dynamics. In: Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. KDD ’15. New York, NY, USA: ACM; 2015. p. 1075–1084. http://doi.acm.org/10.1145/2783258.2783301. Yang J, Leskovec J. Defining and evaluating network communities based on ground-truth. Knowl Inf Syst. 2015;42(1):181–213. Huang X, Lu W, Lakshmanan LV. Truss decomposition of probabilistic graphs: Semantics and algorithms. In: Proceedings of the 2016 International Conference on Management of Data; 2016. p. 77–90. Cheng J, Ke Y, Fu AWC, Yu JX, Zhu L. Finding maximal cliques in massive networks. ACM Trans Database Syst. 2011;36(4):21. Cheng J, Zhu L, Ke Y, Chu S. Fast algorithms for maximal clique enumeration with limited memory. In: Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM; 2012. p. 1240–1248. Tsourakakis C, Bonchi F, Gionis A, Gullo F, Tsiarli M. Denser than the densest subgraph: extracting optimal quasi-cliques with quality guarantees. In: Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM; 2013. p. 104–112. Cheng J, Ke Y, Chu S, Özsu MT. Efficient core decomposition in massive networks. In: 2011 IEEE 27th International Conference on Data Engineering. IEEE; 2011. p. 51–62. Khaouid W, Barsky M, Srinivasan V, Thomo A. K-core decomposition of large networks on a single PC. Proc VLDB Endow. 2015;9(1):13–23. Charikar M. Greedy approximation algorithms for finding dense components in a graph. In: International Workshop on Approximation Algorithms for Combinatorial Optimization. Springer; 2000. p. 84–95. Goldberg AV. Finding a maximum density subgraph. In: Tech. Report No. UCB CSD 84/171. Computer Science Division (EECS), University of California, Berkeley, CA, 1984. Chang L, Yu JX, Qin L, Lin X, Liu C, Liang W. Efficiently computing k-edge connected components via graph decomposition. In: Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data. ACM; 2013. p. 205–216. Zhou R, Liu C, Yu JX, Liang W, Chen B, Li J. Finding maximal k-edge-connected subgraphs from a large graph. In: Proceedings of the 15th International Conference on Extending Database Technology. ACM; 2012. p. 480–491. Alemi M, Haghighi H. KTMiner: distributed k-truss detection in big graphs. Inf Syst. 2019;83:195–216. Sozio M, Gionis A. The community-search problem and how to plan a successful cocktail party. In: Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM; 2010. p. 939–948. Cui W, Xiao Y, Wang H, Lu Y, Wang W. Online search of overlapping communities. In: Proceedings of the 2013 ACM SIGMOD international conference on Management of data. ACM; 2013. p. 277–288. Wu Y, Jin R, Li J, Zhang X. Robust local community detection: on free rider effect and its elimination. Proc VLDB Endow. 2015;8(7):798–809. Zhu Y, He J, Ye J, Qin L, Huang X, Yu JX. When Structure Meets Keywords: Cohesive Attributed Community Search. In: Proceedings of the 29th ACM International Conference on Information & Knowledge Management; 2020. p. 1913–1922. Habib WM, Mokhtar HM, El-Sharkawi ME. Weight-Based K-Truss Community Search via Edge Attachment. IEEE Access. 2020;8:148841–148852. Shao Y, Chen L, Cui B. Efficient cohesive subgraphs detection in parallel. In: Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data; 2014. p. 613–624. Latapy M. Main-memory triangle computations for very large (sparse (power-law)) graphs. Theor Comput Sci. 2008;407(1–3):458–73. https://snap.stanford.edu/data/.