A review of stochastic block models and extensions for graph clustering
Tóm tắt
Từ khóa
Tài liệu tham khảo
Abbe, E (2018) Community detection and stochastic block models: recent developments. J Mach Learn Res 18:1–86.
Adamic, L, Glance N (2005) The political blogosphere and the 2004 u.s. election: divided they blog In: Proceedings of the WWW-2005 Workshop on the Weblogging Ecosystem. https://doi.org/10.1145/1134271.1134277.
Ahmed, A, Xing EP (2010) Timeline: a dynamic hierarchical dirichlet process model for recovering birth/death and evolution of topics in text stream. In: Grunwald P Spirtes P (eds)Proceedings of the Twenty-Sixth Conference on Uncertainty in Artificial Intelligence (UAI’10), 20–29.. AUAI Press, Arlington.
Aicher, C, Jacobs AZ, Clauset A (2015) Learning latent block structure in weighted networks. J Complex Netw 3:221–248.
Airoldi, EM, Blei DM, Fienberg SE, Xing EP (2008) Mixed membership stochastic blockmodels. J Mach Learn Res 9:1981–2014. http://www.jmlr.org/papers/v9/airoldi08a.html.
Arun, R, Suresh V, Madhavan CEV, Murty MN (2010) On finding the natural number of topics with Latent Dirichlet Allocation: some observations In: PAKDD’10 Proceedings of the 14th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining. https://doi.org/10.1007/978-3-642-13657-3_43.
Bickel, P. J, Chen A (2009) A nonparametric view of network models and Newman-Girvan and other modularities. Proc Natl Acad Sci 106(50):21068–21073.
Blei, DM, Lafferty JD (2006) Dynamic topic models In: Proceedings of the 23rd International Conference on Machine Learning. https://doi.org/10.1145/1143844.1143859.
Blei, D. M, Ng A. Y, Jordan M. I (2003) Latent Dirichlet allocation. J Mach Learn Res 3:993–1022. http://www.jmlr.org/papers/v3/blei03a.html.
Blondel, V. D, Guillaume J-L, Lambiotte R, Lefebvre E (2008) Fast unfolding of communities in large networks. J Stat Mech Theory Exp. https://doi.org/10.1088/1742-5468/2008/10/p10008.
Bouveyron, C, Latouche P, Zreik R (2016) The stochastic topic block model for the clustering of networks with textual edges. Working paper or preprint. https://hal.archives-ouvertes.fr/hal-01299161.
Breiger, RL, Boorman SA, Arabie P (1975) An algorithm for clustering relational data with applications to social network analysis and comparison with multidimensional scaling. J Math Psychol 12(3):328–383.
Chang, J, Blei D. M (2010) Hierarchical relational models for document networks. Ann Appl Stat 4(1):124–150.
Chen, M, Kuzmin K, Szymanski B. K (2014) Community detection via maximization of modularity and its variants. IEEE Trans Comput Soc Syst 1(1):46–65.
Cherifi, H, Palla G, Szymanski BK, Lu X (2019) On community structure in complex networks: challenges and opportunities. ArXiv e-prints. arXiv:1908.04901.
Clauset, A, Newman M. E. J, Moore C (2004) Finding community structure in very large networks. Phys Rev E 70:066111.
Côme, E, Latouche P (2015) Model selection and clustering in stochastic block models based on the exact integrated complete data likelihood. Stat Model 15(6):564–589.
Corneli, M, Bouveyron C, Latouche P, Rossi F (2018) The dynamic stochastic topic block model for dynamic networks with textual edges. Stat Comput. https://doi.org/10.1007/s11222-018-9832-4.
Daudin, J. J, Picard F, Robin S (2008) A mixture model for random graphs. Stat Comput 18(2):173–183.
Decelle, A, Krzakala F, Moore C, Zdeborová L (2011) Asymptotic analysis of the stochastic block model for modular networks and its algorithmic applications. Phys Rev E 84:066106.
Dempster, A. P, Laird NM, Rubin D (1977) Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc Ser B (Methodol) 39(1):1–38. http://www.jstor.org/stable/2984875.
DuBois, C, Butts C, Smyth P (2013) Stochastic blockmodeling of relational event dynamics In: Proceedings of the 16th International Conference on Artificial Intelligence and Statistics (AISTATS) 2013.. Scottsdale, AZ, USA. Volume 31 of JMLR: W&CP 31.
Fan, X, Cao L, Xu RYD (2015) Dynamic infinite mixed-membership stochastic blockmodel. IEEE Trans Neural Netw Learn Syst 26(9):2072–2085.
Fan, X, Xu RYD, Cao L (2016) Copula mixed-membership stochastic blockmodel In: Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, IJCAI’16, 1462–1468.. AAAI Press. http://dl.acm.org/citation.cfm?id=3060621.3060824. Accessed 1 Jan 2016.
Fortunato, S, Barthélemy M (2007) Resolution limit in community detection. Proc Natl Acad Sci 104(1):36–41.
Fu, W, Song L, Xing EP (2009) Dynamic mixed membership blockmodel for evolving networks In: Proceedings of the 26th International Conference on Machine Learning, 329–336. https://doi.org/10.1145/1553374.1553416.
Ghasemian, A, Zhang P, Clauset A, Moore C, Peel L (2016) Detectability threshold and optimal algorithms for community structure dynamic networks. Phys Rev X 6:031005.
Girvan, M, Newman MEJ (2002) Community structure in social and biological networks. Proc Natl Acad Sci 99(12):7821–7826.
Godoy-Lorite, A, Guimerà R, Moore C, Sales-Pardo M (2016) Accurate and scalable social recommendation using mixed-membership stochastic block models. Proc Natl Acad Sci 113(50):7–14212.
Good, BH, de Montjoye Y-A, Clauset A (2010) Performance of modularity maximization in practical contexts. Phys Rev E 81:046106.
Gopalan, PK, Mimno DM, Gerrish S, Freedman M, Blei DM (2012) Scalable inference of overlapping communities. In: Pereira F, Burges CJC, Bottou L, Weinberger KQ (eds)Advances in, Neural Information Processing Systems 25 Curran Associates, Inc., 2249–2257. http://papers.nips.cc/paper/4573-scalable-inference-o%f-overlapping-communities.pdf.
Griffiths, TL, Ghahramani Z (2011) The Indian buffet process: An introduction and review. J Mach Learn Res 12:1185–1224.
Grünwald, PD (2007) The Minimum Description Length Principle. The MIT Press. https://doi.org/10.1007/springerreference_179252.
Guimerà, R, Sales-Pardo M (2009) Missing and spurious interactions and the reconstruction of complex networks. Proc Natl Acad Sci 106(52):22073–22078.
Handcock, MS, Raftery AE, Tantrum JM (2007) Model-based clustering for social networks. J R Stat Soc Ser A (Stat Soc) 170:301–354.
Hayashi, K, Knishi T, Kawamoto T (2016) A tractable fully Bayesian method for the stochastic block model. ArXiv e-prints. arXiv:1602.02256.
Ho, Q, Eisenstein J, Xing EP (2012) Document hierarchies from text and links In: Proceedings of the 21st International Conference on World Wide Web, 739–748. https://doi.org/10.1145/2187836.2187936.
Hoff, PD, Raftery AE, Handcock MS (2002) Latent space approaches to social network analysis. J Am Stat Assoc 57(460):1090–1098.
Holland, PW, Laskey KB, Leinhardt S (1983) Stochastic blockmodels: First steps. Soc Netw 5(2):109–137.
Holland, PW (1981) An exponential family of probability distributions for directed graphs. J Am Stat Assoc 76(373):33–50.
Hu, J, Qin H, Yan T, Zhao Y (2019) Corrected Bayesian information criterion for stochastic block models. J Am Stat Assoc. https://doi.org/10.1080/01621459.2019.1637744.
Ji, P, Jin J (2016) Coauthorship and citation networks for statisticians. Ann Appl Stat 10(4):1779–1812.
Karrer, B, Newman MEJ (2011) Stochastic blockmodels and community structure in networks. Phys Rev E 83:016107.
Kim, DI, Gopalan PK, Blei D, Sudderth E (2013) Efficient online inference for Bayesian nonparametric relational models. In: Burges CJC, Bottou L, Welling M, Ghahramani Z, Weinberger KQ (eds)Advances in Neural Information Processing Systems 26.. Curran Associates, Inc.
Klimt, B, Yang Y (2004) The Enron Corpus: A new dataset for email classification research. In: Boulicaut J-F, Esposito F, Giannotti F, Pedreschi D (eds)Machine Learning: ECML 2004’, Vol. 3201 of Lecture Notes in Computer Science, 217–226.. Springer, Berlin Heidelberg.
Kurihara, K, Kameya Y, Sato T (2006) A frequency-based stochastic blockmodel In: Workshop on Information-Based Induction Sciences, Osaka.
Lancichinetti, A, Fortunato S (2011) Limits of modularity maximization in community detection. Phys Rev E 84:066122.
Latouche, P, Birmelé E, Ambroise C (2012) Variational Bayesian inference and complexity control for stochastic block models. Stat Model 12(1):93–115.
Lee, C, Wilkinson D. J (2018) A social network analysis of articles on social network analysis. ArXiv e-prints. arXiv:1810.09781.
Li, W, Ahn S, Welling M (2016) Scalable MCMC for mixed membership stochastic blockmodels. In: Gretton A Robert CC (eds)Proceedings of the 19th, International Conference on Artificial Intelligence and Statistics, Vol. 51 of Proceedings of Machine Learning Research, 723–731.. PMLR, Cadiz. http://proceedings.mlr.press/v51/li16d.html.
Liu, Y, Niculescu-Mizil A, Gryc W (2009) Topic-link LDA: Joint models of topic and author community In: Proceedings of the 26th International Conference on Machine Learning, 665–672. https://doi.org/10.1145/1553374.1553460.
Lu, X, Szymanski BK (2019) Asymptotic resolution bounds of generalized modularity and statistically significant community detection. ArXiv e-prints. arXiv:1902.04243.
Lu, X, Szymanski BK (2019) A regularized stochastic block model for the robust communitydetection in complex networks. Sci Rep 9. https://doi.org/10.1038/s41598-019-49580-5.
Ludkin, M, Eckley I, Neal P (2018) Dynamic stochastic block models: parameter estimation and detection of changes in community structure. Stat Comput 28:1201–1213.
Lunagómez, S, Mukherjee S, Wolpert RL, Airoldi EM (2017) Geometric representations of random hypergraphs. J Am Stat Assoc 112(517):363–383.
Matias, C, Miele V (2017) Statistical clustering of temporal networks through a dynamic stochastic block model. J R Stat Soc Ser B (Stat Methodol) 79(4):1119–1141.
Matias, C, Rebafka T, Villers F (2015) Estimation and clustering in a semiparametric poisson process stochastic block model for longitudinal networks. ArXiv e-prints. arXiv:1512.07075v1.
McCallum, A, Wang X, Corrada-Emmanuel A (2007) Topic and role discovery in social networks with experiments on Enron and academic email. J Artif Intell Res 30:249–272.
McDaid, AF, Murphy TB, Friel N, Hurley NJ (2013) Improved Bayesian inference for the stochastic block model with application to large networks. Comput Stat Data Anal 60:12–31.
Miller, K, Jordan MI, Griffiths TL (2009) Nonparametric latent feature models for link prediction. In: Bengio Y, Schuurmans D, Lafferty JD, Williams CKI, Culotta A (eds)Advances in Neural Information Processing Systems 22, 1276–1284.. Curran Associates, Inc.
Mørup, M, Schmidt MN, Hansen LK (2011) Infinite multiple membership relational modeling for complex networks In: Proceedings of the 2011 IEEE International Workshop on Machine Learning for Signal Processing. https://doi.org/10.1109/mlsp.2011.6064546.
Newman, MEJ (2001) Scientific collaboration networks. I. network construction and fundamental results. Phys Rev E 64:016131. https://doi.org/10.1103/physreve.64.016131.
Newman, MEJ (2001) The structure of scientific collaboration networks. Proc Natl Acad Sci 98(2):404–409.
Newman, MEJ (2004) Coauthorship networks and patterns of scientific collaboration. Proc Natl Acad Sci 101:5200–5205.
Newman, MEJ (2006) Finding community structure in networks using the eigenvectors of matrices. Phys Rev E 74:036104.
Newman, MEJ (2006) Modularity and community structure in networks. Proc Natl Acad Sci 103(23):8577–8582.
Newman, MEJ (2016) Equivalence between modularity optimization and maximum likelihood methods for community detection. Phys Rev E 94:052315.
Newman, MEJ, Girvan M (2004) Finding and evaluating community structure in networks. Phys Rev E 69:026113.
Newman, MEJ, Reinert G (2016) Estimating the number of communities in a network. Phys Rev Lett 117:078301.
Ng, TJ, Murphy TB (2018) Model-based clustering for random hypergraphs. ArXiv e-prints. arXiv:1808.05185.
Nobile, A, Fearnside A (2007) Bayesian finite mixtures with an unknown number of components: the allocation sampler. Stat Comput 17:147–162.
Nowicki, K, Snijders TAB (2001) Estimation and prediction for stochastic blockstructures. J Am Stat Assoc 96(455):1077–1087.
Palla, K, Knowles DA, Ghahramani Z (2012) An infinite latent attribute model for network data In: Proceedings of the 29th International Conference on Machine Learning, Edinburgh.
Papaspiliopoulos, O, Roberts GO (2008) Retrospective Markov chain Monte Carlo methods for Dirichlet process hierarchical models. Biometrika 95(1):169–186.
Pathak, N, DeLong C, Banerjee A, Erickson K (2008) Social topic models for community extraction In: The 2nd SNA-KDD Workshop ’08’.
Peixoto, TP (2014) Efficient Monte Carlo and greedy heuristic for the inference of stochastic block models. Phys Rev E 89:012804.
Peixoto, TP (2014) Hierarchical block structures and high-resolution model selection in large networks. Phys Rev X 4:011047.
Peixoto, TP (2015) Inferring the mesoscale structure of layered, edge-valued, and time-varying networks. Phys Rev E 92:042807.
Peixoto, TP (2015) Model selection and hypothesis testing for large-scale network models with overlapping groups. Phys Rev X 5:011033.
Peixoto, TP (2017) Bayesian stochatic blockmodeling. ArXiv e-prints. arXiv:1705.10225.
Peixoto, TP (2017) Nonparametric Bayesian inference of the microcanonical stochastic block model. Phys Rev E 95:012317.
Peixoto, TP (2018) Reconstructing networks with unknown and heterogeneous errors. Phys Rev X 8:041011.
Pons, P, Latapy M (2006) Computing communities in large networks using random walks. J Graph Algoritm Appl 10(2):191–218.
Priebe, CE, Sussman DL, Tang M, Vogelstein JT (2015) Statistical inference on errorfully observed graphs. J Comput Graph Stat 24(4):930–953.
Raghavan, UN, Albert R, Kumara S (2007) Near linear time algorithm to detect community structures in large-scale networks. Phys Rev E 76:036106.
Ranciati, S, Vinciotti V, Wit EC (2017) Identifying overlapping terrorist cells from the noordin top actor-event network. ArXiv e-prints. arXiv:1710.10319.
Ren, L, Dunson DB, Carin L (2008) The dynamic hierarchical Dirichlet process In: Proceedings of the 25th International Conference on Machine Learning, 824–831. https://doi.org/10.1145/1390156.1390260.
Rohe, K, Chatterjee S, Yu B (2011) Spectral clustering and the high-dimensional stochastic blockmodel. Ann Stat 39(4):1878–1915.
Rosvall, M, Axelsson D, Bergstrom CT (2009) The map equation. Eur Phys J 178:13–23.
Rosvall, M, Bergstrom CT (2007) An information-theoretic framework for resolving community structure in complex networks. Proc Natl Acad Sci 104(18):7327–7331.
Sachan, M, Contractor D, Faruquie TA, Subramaniam LV (2012) Using content and interactions for discovering communities in social networks In: Proceedings of the 21st International Conference on World Wide Web, 331–340. https://doi.org/10.1145/2187836.2187882.
Sanna Passino, F, Heard NA (2019) Bayesian estimation of the latent dimension and communities in stochastic blockmodels. ArXiv e-prints. arXiv:1904.05333.
Schaub, MT, Delvenne J, Rosvall M, Lambiotte R (2014) The many facets of community detection in complex networks. Appl Netw Sci 2:4.
Snijders, TAB, Nowicki K (1997) Estimation and prediction for stochastic blockmodels for graphs with latent block structure. J Classif 14(1):75–100.
Stanley, N, Bonacci T, Kwitt R, Niethammer M, Mucha PJ (2019) Stochastic block models with multiple continuous attributes. Appl Netw Sci 4(54). https://doi.org/10.1007/s41109-019-0170-z.
Tallberg, C (2005) A Bayesian approach to modeling stochastic blockstructures with covariates. J Math Sociol 29:1–23.
Tang, X, Yang CC (2011) Dynamic community detection with temporal Dirichlet process In: 2011 IEEE Third International Conference on Privacy, Security, Risk and Trust and 2011 IEEE Third International Conference on Social Computing, 603–608. https://doi.org/10.1109/passat/socialcom.2011.37.
Tang, X, Yang CC (2014) Detecting social media hidden communities using dynamic stochastic blockmodel with temporal Dirichlet process. ACM Trans Intell Syst Technol 5(2). https://doi.org/10.1145/2517085.
Tarrés-Deulofeu, M, Godoy-Lorite A, Guimerà R, Sales-Pardo M (2019) Tensorial and bipartite block models for link prediction in layered networks and temporal networks. Phys Rev E 99:032307.
Teh, YW, Jordan MI, Beal MJ, Blei DM (2005) Sharing clusters among related groups: Hierarchical Dirichlet processes. In: Saul LK, Weiss Y, Bottou L (eds)Advances in Neural Information Processing Systems 17 (NIPS 2004), 1385–1392.. MIT Press.
Valles-Català̀, T, Massucci FA, Guimerà R, Sales-Pardo M (2016) Multilayer stochastic block models reveal the multilayer structure of complex networks. Phys Rev X 6:011036.
Valles-Català̀, T, Peixoto TP, Sales-Pardo M, Guimerà R (2018) Consistencies and inconsistencies between model selection and link prediction in networks. Phys Rev E 97:062316.
Vu, DQ, Hunter DR, Schweinberger M (2013) Model-based clustering of large networks. Ann Appl Stat 7(2):1010–1039.
Wakita, K, Tsurumi T (2007) Finding community structure in mega-scale social networks In: Proceedings of the 16th International Conference on World Wide Web, 1275–1276. https://doi.org/10.1145/1242572.1242805.
Wang, C, Paisley J, Blei DM (2011) Online variational inference for the hierarchical Dirichlet process In: Proceedings of the 14th International Conference on Artificial Intelligence and Statistics (AISTATS)Š 2011, 752–760.. Volume 15 of JMLR: W&CP 15, Fort Lauderdale.
Wang, YXR, Bickel PJ (2017) Likelihood-based model selection for stochastic block models. Ann Stat 45(2):500–528.
White, HC, Boorman, Breiger RL (1976) Social structure from multiple networks. i. blockmodels of roles and positions. Am J Sociol 81(4):730–780.
Xie, J, Kelley S, Szymanski BK (2013) Overlapping community detection in networks: The state-of-the-art and comparative study. ACM Comput Surv 45(4):43:1–43:35. http://doi.acm.org/10.1145/2501654.2501657.
Xing, EP, Fu W, Song L (2010) A state-space mixed membership blockmodel for dynamic network tomography. Ann Appl Stat 4(2):535–566.
Xu, KS, Hero III AO (2013) Dynamic stochastic blockmodels: Statistical models for time-evolving networks In: Social Computing, Behavioral-Cultural Modeling and Prediction, 201–210. https://doi.org/10.1007/978-3-642-37210-0_22.
Yan, X (2016) Bayesian model selection of stochastic block models In: 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), 323–328. https://doi.org/10.1109/asonam.2016.7752253.
Yan, X, Shalizi C, Jensen JE, Krzakala F, Moore C, Zdeborová L, Zhang P, Zhu Y (2014) Model selection for degree-corrected block models. J Stat Mech Theory Exp 5:P05007.
Yang, T, Chi Y, Zhu S, Gong Y, Jin R (2011) Detecting communities and their evolutions in dynamic social networks - a Bayesian approach. Mach Learn 82:157–189.
Zachary, WW (1977) An information flow model for conflict and fission in small groups. J Anthropol Res 33(4):452–473. https://www.jstor.org/stable/3629752.
Zhang, X, Moore C, Newman MEJ (2017) Random graph models for dynamics networks. Eur Phys J B 90:200.
Zhao, Y, Wu Y-J, Levina E, Zhu J (2017) Link prediction for partially observed networks. J Comput Graph Stat 26(3):725–733.
Zhou, D, Manavoglu E, Li J, Giles CL, Zha H (2006) Probabilistic models for discovering e-communities In: Proceedings of the 15th International Conference on World Wide Web. https://doi.org/10.1145/1135777.1135807.
Zhou, M (2015) Infinite edge partition models for overlapping community detection and link prediction In: Proceedings of the 18th International Conference on Artificial Intelligence and Statistics (AISTATS) 2015, 1135–1143.. JMLR: W&CP volume 38, San Diego.