A Comparative Analysis of Community Detection Algorithms on Artificial Networks

Scientific Reports - Tập 6 Số 1
Zhao Yang1, René Algesheimer1, Claudio J. Tessone1
1URPP Social Networks, University of Zürich, Andreasstrasse 15, Zürich, CH-8050, Switzerland

Tóm tắt

AbstractMany community detection algorithms have been developed to uncover the mesoscopic properties of complex networks. However how good an algorithm is, in terms of accuracy and computing time, remains still open. Testing algorithms on real-world network has certain restrictions which made their insights potentially biased: the networks are usually small, and the underlying communities are not defined objectively. In this study, we employ the Lancichinetti-Fortunato-Radicchi benchmark graph to test eight state-of-the-art algorithms. We quantify the accuracy using complementary measures and algorithms’ computing time. Based on simple network properties and the aforementioned results, we provide guidelines that help to choose the most adequate community detection algorithm for a given network. Moreover, these rules allow uncovering limitations in the use of specific algorithms given macroscopic network properties. Our contribution is threefold: firstly, we provide actual techniques to determine which is the most suited algorithm in most circumstances based on observable properties of the network under consideration. Secondly, we use the mixing parameter as an easily measurable indicator of finding the ranges of reliability of the different algorithms. Finally, we study the dependency with network size focusing on both the algorithm’s predicting power and the effective computing time.

Từ khóa


Tài liệu tham khảo

Newman, M. E. The structure and function of complex networks. SIAM Review 45, 167–256 (2003).

Boccaletti, S., Latora, V., Moreno, Y., Chavez, M. & Hwang, D.-U. Complex networks: Structure and dynamics. Physics Reports 424, 175–308 (2006).

Girvan, M. & Newman, M. E. Community structure in social and biological networks. Proceedings of the National Academy of Sciences 99, 7821–7826 (2002).

Fortunato, S. Community detection in graphs. Physics Reports 486, 75–174 (2010).

Lancichinetti, A., Fortunato, S. & Radicchi, F. Benchmark graphs for testing community detection algorithms. Physical Review E 78, 046110 (2008).

Lancichinetti, A., Kivelä, M., Saramäki, J. & Fortunato, S. Characterizing the community structure of complex networks. PloS ONE 5, e11976 (2010).

Condon, A. & Karp, R. M. Algorithms for graph partitioning on the planted partition model. Random Structures and Algorithms 18, 116–140 (2001).

Danon, L., Díaz-Guilera, A., Duch, J. & Arenas, A. Comparing community structure identification. Journal of Statistical Mechanics: Theory and Experiment 2005, P09008 (2005).

Barabási, A.-L. & Albert, R. Emergence of scaling in random networks. Science 286, 509–512 (1999).

Palla, G., Derényi, I., Farkas, I. & Vicsek, T. Uncovering the overlapping community structure of complex networks in nature and society. Nature 435, 814–818 (2005).

Guimerà, R., Danon, L., Díaz-Guilera, A., Giralt, F. & Arenas, A. Self-similar community structure in a network of human interactions. Physical Review E 68, 065103 (2003).

Clauset, A., Newman, M. E. & Moore, C. Finding community structure in very large networks. Physical Review E 70, 066111 (2004).

Lancichinetti, A. & Fortunato, S. Benchmarks for testing community detection algorithms on directed and weighted graphs with overlapping communities. Physical Review E 80, 016118 (2009).

Orman, G. K. & Labatut, V. A comparison of community detection algorithms on artificial networks. In Discovery Science 242–256 (Springer, 2009).

Lancichinetti, A. & Fortunato, S. Community detection algorithms: a comparative analysis. Physical Review E 80, 056117 (2009).

Radicchi, F., Castellano, C., Cecconi, F., Loreto, V. & Parisi, D. Defining and identifying communities in networks. Proceedings of the National Academy of Sciences 101, 2658–2663 (2004).

Peel, L. Estimating network parameters for selecting community detection algorithms. In 13th Conference on Information Fusion 1–8 (IEEE, 2010).

Hric, D., Darst, R. K. & Fortunato, S. Community detection in networks: Structural communities versus ground truth. Physical Review E 90, 062805 (2014).

Yang, J. & Leskovec, J. Defining and evaluating network communities based on ground-truth. Knowledge and Information Systems 42, 181–213 (2015).

Csardi, G. & Nepusz, T. The igraph software package for complex network research. InterJournal, Complex Systems 1695, URL http://igraph.org (2006).

Vinh, N. X., Epps, J. & Bailey, J. Information theoretic measures for clusterings comparison: Variants, properties, normalization and correction for chance. The Journal of Machine Learning Research 11, 2837–2854 (2010).

Romano, S., Bailey, J., Nguyen, V. & Verspoor, K. Standardized mutual information for clustering comparisons: one step further in adjustment for chance. In Proceedings of the 31st International Conference on Machine Learning (ICML-14), 1143–1151 (2014).

Zhang, P. Evaluating accuracy of community detection using the relative normalized mutual information. Journal of Statistical Mechanics: Theory and Experiment 2015, P11006 (2015).

Papadopoulos, S., Kompatsiaris, Y., Vakali, A. & Spyridonos, P. Community detection in social media. Data Mining and Knowledge Discovery 24, 515–554 (2012).

Blondel, V. D., Guillaume, J.-L., Lambiotte, R. & Lefebvre, E. Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment 2008, P10008 (2008).

Fortunato, S. & Barthélemy, M. Resolution limit in community detection. Proceedings of the National Academy of Sciences 104, 36–41 (2007).

Newman, M. & Clauset, A. Structure and inference in annotated networks. arXiv preprint arXiv:1507.04001 (2015).

Zachary, W. W. An information flow model for conflict and fission in small groups. Journal of Anthropological Research 452–473 (1977).

Danon, L., Díaz-Guilera, A. & Arenas, A. The effect of size heterogeneity on community identification in complex networks. Journal of Statistical Mechanics: Theory and Experiment 2006, P11010 (2006).

Molloy, M. & Reed, B. A critical point for random graphs with a given degree sequence. Random Structures and Algorithms 6, 161–180 (1995).

Bagrow, J. P. Evaluating local community methods in networks. Journal of Statistical Mechanics: Theory and Experiment 2008, P05001 (2008).

Poncela, J., Gómez-Gardeñes, J., Floria, L. M., Sánchez, A. & Moreno, Y. Complex cooperative networks from evolutionary preferential attachment. PLoS One 3, e2449 (2008).

Orman, G. K. & Labatut, V. The effect of network realism on community detection algorithms. In International Conference on Advances in Social Networks Analysis and Mining 301–305 (IEEE, 2010).

Freeman, L. C. Centrality in social networks conceptual clarification. Social Networks 1, 215–239 (1979).

Rosvall, M. & Bergstrom, C. T. An information-theoretic framework for resolving community structure in complex networks. Proceedings of the National Academy of Sciences 104, 7327–7331 (2007).

Rosvall, M., Axelsson, D. & Bergstrom, C. T. The map equation. The European Physical Journal Special Topics 178, 13–23 (2010).

Mukherjee, A., Choudhury, M., Peruani, F., Ganguly, N. & Mitra, B. Dynamics On and Of Complex Networks, Volume 2: Applications to Time-Varying Dynamical Systems (Springer Science & Business Media, 2013).

Raghavan, U. N., Albert, R. & Kumara, S. Near linear time algorithm to detect community structures in large-scale networks. Physical Review E 76, 036106 (2007).

Newman, M. E. Finding community structure in networks using the eigenvectors of matrices. Physical Review E 74, 036104 (2006).

Xie, J. & Szymanski, B. K. Community detection using a neighborhood strength driven label propagation algorithm. In Network Science Workshop 188–195 (IEEE, 2011).

Reichardt, J. & Bornholdt, S. Statistical mechanics of community detection. Physical Review E 74, 016110 (2006).

Wu, F.-Y. The Potts model. Reviews of Modern Physics 54, 235 (1982).

Kirkpatrick, S. Optimization by simulated annealing: Quantitative studies. Journal of Statistical Physics 34, 975–986 (1984).

Traag, V. & Bruggeman, J. Community detection in networks with positive and negative links. Physical Review E 80, 036115 (2009).

Dahlin, J. & Svenson, P. Ensemble approaches for improving community detection methods. arXiv:1309.0242 [physics.soc-ph] (2013).

Pons, P. & Latapy, M. Computing communities in large networks using random walks. In Computer and Information Sciences-ISCIS 2005, 284–293 (Springer, 2005).