Hierarchical community detection via rank-2 symmetric nonnegative matrix factorization

Rundong Du1, Da Kuang2, Barry Drake3, Haesun Park3
1School of Mathematics, Georgia Institute of Technology, 686 Cherry Street, Atlanta, GA 30332-0160, USA
2Department of Mathematics, University of California, Los Angeles, 520 Portola Plaza, Los Angeles, CA, 90095-1555, USA
3School of Computational Science and Engineering, Georgia Institute of Technology, 266 Ferst Drive, Atlanta, GA, 30332-0765, USA

Tóm tắt

Từ khóa


Tài liệu tham khảo

Yang J, Leskovec J. Overlapping community detection at scale: a nonnegative matrix factorization approach. In: Proceedings of the sixth ACM international conference on web search and data mining. WSDM ’13. New York: ACM; 2013. p. 587–96. doi: 10.1145/2433396.2433471 .

Whang JJ, Gleich DF, Dhillon IS. Overlapping community detection using seed set expansion. In: Proceedings of the 22nd ACM international conference on information & knowledge management. CIKM ’13. New York: ACM; 2013. p. 2099–108. doi: 10.1145/2505515.2505535 .

Whang JJ, Gleich DF, Dhillon IS. Overlapping community detection using neighborhood-inflated seed expansion. IEEE Trans Knowl Data Eng. 2016;28(5):1272–84. doi: 10.1109/TKDE.2016.2518687 .

Yang J, Leskovec J. Defining and evaluating network communities based on ground-truth. Knowl Inf Syst. 2013;42(1):181–213. doi: 10.1007/s10115-013-0693-z .

Kernighan BW, Lin S. An efficient heuristic procedure for partitioning graphs. Bell Syst Tech J. 1970;49(2):291–307. doi: 10.1002/j.1538-7305.1970.tb01770.x .

Fortunato S. Community detection in graphs. Phys Rep. 2010;486(3–5):75–174. doi: 10.1016/j.physrep.2009.11.002 .

Karypis G, Kumar V. A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM J Sci Comput. 1998;20(1):359–92. doi: 10.1137/S1064827595287997 .

Pellegrini F, Roman J. SCOTCH: a software package for static mapping by dual recursive bipartitioning of process and architecture graphs. In: Proceedings of the international conference and exhibition on high-performance computing and networking. HPCN Europe 1996. London: Springer; 1996. p. 493–8.

Hendrickson B, Leland R. An improved spectral graph partitioning algorithm for mapping parallel computations. SIAM J Sci Comput. 1995;16(2):452–69. doi: 10.1137/0916028 .

Shi J, Malik J. Normalized cuts and image segmentation. IEEE Trans Pattern Anal Mach Intell. 2000;22(8):888–905. doi: 10.1109/34.868688 .

Dhillon IS, Guan Y, Kulis B. Weighted graph cuts without eigenvectors a multilevel approach. IEEE Trans Pattern Anal Mach Intell. 2007;29(11):1944–57. doi: 10.1109/TPAMI.2007.1115 .

Dhillon IS, Guan Y, Kulis B. Kernel K-means: spectral clustering and normalized cuts. In: Proceedings of the tenth ACM SIGKDD international conference on knowledge discovery and data mining. KDD ’04. New York: ACM; 2004. p. 551–6. doi: 10.1145/1014052.1014118 .

Kuang D, Ding C, Park H. Symmetric nonnegative matrix factorization for graph clustering. In: Proceedings of the 2012 SIAM international conference on data mining. Philadelphia: Society for Industrial and Applied Mathematics; 2012. p. 106–17.

Kuang D, Yun S, Park H. SymNMF: nonnegative low-rank approximation of a similarity matrix for graph clustering. J Glob Optim. 2014;62(3):545–74. doi: 10.1007/s10898-014-0247-2 .

Girvan M, Newman MEJ. Community structure in social and biological networks. Proc Natl Acad Sci. 2002;99(12):7821–6. doi: 10.1073/pnas.122653799 .

Newman MEJ, Girvan M. Finding and evaluating community structure in networks. Phys Rev E. 2004;69(2):026113. doi: 10.1103/PhysRevE.69.026113 .

Prat-Pérez A, Dominguez-Sal D, Larriba-Pey J-L. High quality, scalable and parallel community detection for large real graphs. In: Proceedings of the 23rd international conference on world wide web. WWW ’14. New York: ACM; 2014. p. 225–36. doi: 10.1145/2566486.2568010 .

Blondel VD, Guillaume J-L, Lambiotte R, Lefebvre E. Fast unfolding of communities in large networks. J Stat Mech Theory Exp. 2008;2008(10):10008. doi: 10.1088/1742-5468/2008/10/P10008 .

Pons P, Latapy M. Computing communities in large networks using random walks. J Graph Algorithms Appl. 2006;10(2):191–218. doi: 10.7155/jgaa.00124 .

Rosvall M, Bergstrom CT. Maps of random walks on complex networks reveal community structure. Proc Natl Acad Sci. 2008;105(4):1118–23. doi: 10.1073/pnas.0706851105 .

Baumes J, Goldberg M, Krishnamoorty M, Magdon-Ismail M, Preston N. Finding communities by clustering a graph into overlapping subgraphs. In: Guimaraes N, Isaias P, editors. Proceedings of the IADIS international conference on applied computing. Applied computing, vol. 1. Lisbon: IADIS; 2005. p. 97–104.

Palla G, Derényi I, Farkas I, Vicsek T. Uncovering the overlapping community structure of complex networks in nature and society. Nature. 2005;435(7043):814–8. doi: 10.1038/nature03607 .

Lancichinetti A, Radicchi F, Ramasco JJ, Fortunato S. Finding statistically significant communities in networks. PLoS ONE. 2011;6(4):18961. doi: 10.1371/journal.pone.0018961 .

Yang J, Leskovec J. Structure and overlaps of ground-truth communities in networks. ACM Trans Intell Syst Technol. 2014;5(2):26–12635. doi: 10.1145/2594454 .

Kuang D, Park H. Fast rank-2 nonnegative matrix factorization for hierarchical document clustering. In: Proceedings of the 19th ACM SIGKDD international conference on knowledge discovery and data mining. KDD ’13. New York: ACM; 2013. p. 739–47. doi: 10.1145/2487575.2487606 .

Xu W, Liu X, Gong Y. Document clustering based on non-negative matrix factorization. In: Proceedings of the 26th annual international ACM SIGIR conference on research and development in information retrieval. SIGIR ’03. New York: ACM; 2003. p. 267–73. doi: 10.1145/860435.860485 .

Gillis N, Kuang D, Park H. Hierarchical clustering of hyperspectral images using rank-two nonnegative matrix factorization. IEEE Trans Geosci Remote Sens. 2015;53(4):2066–78. doi: 10.1109/TGRS.2014.2352857 .

Lee DD, Seung HS. Learning the parts of objects by non-negative matrix factorization. Nature. 1999;401(6755):788–91. doi: 10.1038/44565 .

Drake B, Kim J, Mallick M, Park H. Supervised Raman spectra estimation based on nonnegative rank deficient least squares. In: Proceedings 13th international conference on information fusion, Edinburgh, UK. 2010.

Kim J, Park H. Fast nonnegative matrix factorization: an active-set-like method and comparisons. SIAM J Sci Comput. 2011;33(6):3261–81. doi: 10.1137/110821172 .

Ho N-D. Nonnegative matrix factorization algorithms and applications. PhD thesis, ÉCOLE POLYTECHNIQUE. 2008.

Du R, Kuang D, Drake B, Park H. DC-NMF: nonnegative matrix factorization based on divide-and-conquer for fast clustering and topic modeling. J Glob Optim. 2017;68(4):777–98. doi: 10.1007/s10898-017-0515-z .

Sinclair A, Jerrum M. Approximate counting, uniform generation and rapidly mixing Markov chains. Inf Comput. 1989;82(1):93–133. doi: 10.1016/0890-5401(89)90067-9 .

Leskovec J, Krevl A. SNAP datasets: Stanford large network dataset collection. 2014. http://snap.stanford.edu/data .

Mislove A, Marcon M, Gummadi KP, Druschel P, Bhattacharjee B. Measurement and analysis of online social networks. In: Proceedings of the 7th ACM SIGCOMM conference on internet measurement. IMC ’07. New York: ACM; 2007. p. 29–42. doi: 10.1145/1298306.1298311 .

Backstrom L, Huttenlocher D, Kleinberg J, Lan X. Group formation in large social networks: membership, growth, and evolution. In: Proceedings of the 12th ACM SIGKDD international conference on knowledge discovery and data mining. KDD ’06. New York: ACM; 2006. p. 44–54. doi: 10.1145/1150402.1150412 .

Leskovec J, Adamic LA, Huberman BA. The dynamics of viral marketing. ACM Trans Web. 2007;1(1):5. doi: 10.1145/1232722.1232727 .

dblp: What is dblp? 2015. http://dblp.uni-trier.de/faq/What+is+dblp.html . Accessed 29 June 2015.

dblp: dblp: computer science bibliography. 2015. http://dblp.uni-trier.de/ . Accessed 17 June 2015.

dblp: What do I find in dblp.xml? 2015. http://dblp.uni-trier.de/faq/What+do+I+find+in+dblp+xml.html . Accessed 30 June 2015.

Graham RL, Knuth DE, Patashnik O. Concrete mathematics: a foundation for computer science, 2nd ed. Chap. 2.2: sums and recurrences. Boston: Addison-Wesley Longman Publishing Co., Inc.; 1994. p. 24.

Ding C, He X, Simon HD. On the equivalence of nonnegative matrix factorization and spectral clustering. In: SIAM international conference on data mining. 2005.