Data mining of social networks represented as graphs

Computer Science Review - Volume 7 - Pages 1-34 - 2013
David Nettleton (1, 2)
(1) IIIA-CSIC, Bellaterra, Spain
(2) Universitat Pompeu Fabra, Barcelona, Spain

References

Easley, 2010

Cormen, 2001

West, 2001

Tomassini, 2010

Wasserman, 1994, 1

Farganis, 1993

Simmel, 1903

Levy Moreno, 1953

McPherson, 2001, Birds of a feather: homophily in social networks, Annual Review of Sociology, 27, 415, 10.1146/annurev.soc.27.1.415

Robins, 2005, Small and other worlds: global network structures from local processes, American Journal of Sociology (AJS), 110, 894, 10.1086/427322

Hauben, 1997

Hafner, 2001

O. El Akkad, So Long, GeoCities, The Globe and Mail, published Oct. 02 2009. Available at: http://www.theglobeandmail.com/technology/globe-on-technology/so-long-geocities/article790790/.

Boyd, 2007, Social network sites: Definition, history, and scholarship, Journal of Computer-Mediated Communication, 13, 10.1111/j.1083-6101.2007.00393.x

Knapp, 2006

Wilson, 2012, A review of Facebook research in the social sciences, Perspectives on Psychological Science, 7, 203, 10.1177/1745691612442904

Zachary, 1977, An Information Flow Model for Conflict and Fission in Small Groups, Journal of Anthropological Research, 33, 452, 10.1086/jar.33.4.3629752

Lusseau, 2003, vol. 54, 396

Girvan, 2002, Community structure in social and biological networks, Proceedings National Academy of Sciences of the USA (PNAS), 99, 7821, 10.1073/pnas.122653799

Stanford Network Analysis Platform Datasets. Available at http://snap.stanford.edu/data/index.html.

Gehrke, 2003, Overview of the 2003 KDD Cup, SIGKDD Explorations, 5, 149, 10.1145/980972.980992

Leskovec, 2007, Graph evolution: densification and shrinking diameters, ACM Transactions on Knowledge Discovery from Data (ACM TKDD), 1

J. Shetty, J. Adibi, Discovering important nodes through graph entropy: the case of Enron email database, in: Proc. 3rd Int. Workshop on Link Discovery, LinkKDD ’05, 2005, pp. 74–81.

M. Seshadri, S. Machiraju, A. Sridharan, J. Bolot, C. Faloutsos, J. Leskovec, Mobile call graphs: beyond power-law and lognormal distributions, in: Proc. 14th ACM SIGKDD, KDD ’08, New York, NY, USA, 2008, pp. 596–604.

M. Richardson, R. Agrawal, P. Domingos, Trust Management for the Semantic Web, in: Proc. 2nd Int. Semantic Web Conference, ISWC, 2003, pp. 351–368.

Backstrom, 2006, Group Formation in Large Social Networks: Membership, Growth, and Evolution, 44

Leskovec, 2010, Signed networks in social media, 1361

McAuley, 2012, Image labeling on a network: using social-network metadata for image classification, vol. 7575, 828

Yang, 2011, Temporal variation in online media, 177

Twitter development APIs. Available at https://dev.twitter.com/.

LinkedIn developer APIs. Available at http://developer.linkedin.com/apis.

M. Bastian, S. Heymann, M. Jacomy, Gephi: An Open Source Software for Exploring and Manipulating Networks, in: Proc. 3rd. Int. AAAI Conference on Weblogs and Social Media, 2009, pp. 361–362.

NetMiner 4, software tool for exploratory analysis and visualization of network data. Available at http://www.netminer.com.

‘Neo4J’ Graph Database System. Available at http://neo4j.org/.

A.A. Hagberg, D.A. Schult, P.J. Swart, Exploring Network Structure, Dynamics, and Function Using NetworkX, in: Proc. 7th Python in Science Conference, SciPy2008, Pasadena, CA, USA, 2008, pp. 11–15.

JUNG, Java Universal Network/Graph Framework. Available at http://jung.sourceforge.net/.

‘igraph’ library and API. Available at http://igraph.sourceforge.net/.

Stanford Network Analysis Platform (SNAP). Available at http://snap.stanford.edu/snap/index.html.

2007

Liben-Nowell, 2007, The link-prediction problem for social networks, Journal of the American Society for Information Science and Technology, 58, 1019, 10.1002/asi.20591

Ramamoorthy, 1966, Analysis of graphs by connectivity considerations, Journal of the ACM (JACM), 13, 211, 10.1145/321328.321332

H. Tong, S. Papadimitriou, J. Sun, P.S. Yu, C. Faloutsos, Colibri: fast mining of large static and dynamic graphs, in: Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’08, New York, NY, USA, 2008, pp. 686–694.

V. Nair, D. Mahajan, S. Sellamanickam, A unified approach to learning task-specific bit vector representations for fast nearest neighbour search, in: Proc. World Wide Web 2012, WWW 2012, April 16–20, Lyon, France, 2012, pp. 929–938.

D. Gibson, J.M. Kleinberg, P. Raghavan, Inferring Web Communities from Link Topology, in: Proceedings of the Ninth ACM Conference on Hypertext and Hypermedia (Hypertext ’98): Links, Objects, Time and Space, 1998, pp. 225–234.

J.M. Kleinberg, R. Kumar, P. Raghavan, S. Rajagopalan, A.S. Tomkins, The web as a graph: measurements, models, and methods, in: Proceedings of the 5th Annual International Conference on Computing and Combinatorics, COCOON’99, 1999, pp. 1–17.

Chakrabarti, 2006, Graph mining: Laws, generators, and algorithms, ACM Computing Surveys, 38, 10.1145/1132952.1132954

X. Yan, J. Han, gSpan: Graph-Based Substructure Pattern Mining, in: Proc. Second IEEE International Conference on Data Mining, ICDM’02, 2002, p. 721.

Inokuchi, 2000, vol. 1910/2000, 13

A. Mislove, M. Marcon, K.P. Gummadi, P. Druschel, B. Bhattacharjee, Measurement and Analysis of Online Social Networks, in: Proceedings of the 7th ACM SIGCOMM Conference on Internet Measurement, IMC ’07, 2007, pp. 29–42.

Eppstein, 2004, Fast Approximation of Centrality, Journal of Graph Algorithms and Applications, 8, 39, 10.7155/jgaa.00081

Newman, 2003, Why social networks are different from other types of networks, Physical Review E, 68, 036122, 10.1103/PhysRevE.68.036122

Dunbar, 1993, Coevolution of neocortical size, group size and language in humans, Behavioral and Brain Sciences, 16, 681, 10.1017/S0140525X00032325

Chakrabarti, 2004

B. Viswanath, A. Mislove, M. Cha, K.P. Gummadi, On the Evolution of User Interaction in Facebook, in: Proceedings of the 2nd ACM workshop on Online Social Networks WOSN’09, Barcelona, Spain, 2009, pp. 37–42.

Kossinets, 2006, Empirical analysis of an evolving social network, Science, 311, 88, 10.1126/science.1116869

L. Tang, H. Liu, J. Zhang, N. Nazeri, Community evolution in dynamic multi-mode networks, in: Proc. of the 14th ACM SIGKDD, KDD ’08, New York, NY, USA, 2008, pp. 677–685.

J. Leskovec, J. Kleinberg, C. Faloutsos, Graphs over Time: Densification Laws, Shrinking Diameters and Possible Explanations, in: Proc. KDD ’05, 11th ACM SIGKDD Int. Conf. of Knowledge Discovery and Data Mining, 2005, pp. 177–187.

M. McGlohon, L. Akoglu, C. Faloutsos, Weighted Graphs and Disconnected Components: Patterns and a Generator, in: Proc. 14th ACM SIGKDD Int. Conf. of Knowledge Discovery and Data Mining, KDD ’08, 2008, pp. 524–532.

Kumar, 2010, Structure and Evolution of Online Social Networks, 337

Randic, 1997, Dense graphs and sparse matrices, Journal of Chemical Information and Computer Sciences, 37, 1078, 10.1021/ci970241z

Newman, 2003, The structure and function of complex networks, SIAM Review, 45, 167, 10.1137/S003614450342480

W. Hwang, T. Kim, M. Ramanathan, A. Zhang, Bridging centrality: graph mining from element level to group level, in: Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’08, New York, NY, USA, 2008, pp. 336–344.

Y.-Y. Ahn, S. Han, H. Kwak, Y.-H. Eom, S. Moon, H. Jeong, Analysis of Topological Characteristics of Huge Online Social Networking Services, in: Proc. 16th International Conference on World Wide Web, WWW ’07, 2007, pp. 835–844.

B. Goncalves, N. Perra, A. Vespignani, Validation of Dunbar’s number in Twitter conversations. arXiv.org-physics-arXiv:1105.5170, 28 May 2011 (http://arxiv.org/abs/1105.5170).

J.M. Kleinberg, The Small-World Phenomenon: An Algorithmic Perspective, in: Proceedings of the Thirty-second Annual ACM Symposium on Theory of Computing, STOC ’00, 2000, pp. 163–170.

Broder, 2000, Graph structure in the web, Computer Networks: The International Journal of Computer and Telecommunications Networking, 33, 309

Feigenbaum, 2005, On graph problems in a semi-streaming model, Theoretical Computer Science, 348, 207, 10.1016/j.tcs.2005.09.013

C. Demetrescu, I. Finocchi, A. Ribichini, Trading off space for passes in graph streaming problems, in: ACM–SIAM Symposium on Discrete Algorithms, 2006, pp. 714–723.

A. Das Sarma, S. Gollapudi, R. Panigrahy, Estimating PageRank on graph streams, in: ACM Symposium on Principles of Database Systems, 2008, pp. 69–78.

Goodman, 1961, Snowball sampling, Annals of Mathematical Statistics, 32, 148, 10.1214/aoms/1177705148

Snijders, 1992, Estimation on the basis of snowball samples: how to weight?, Bulletin de Méthodologie Sociologique, 59, 10.1177/075910639203600104

T. Shafie, Design-based Estimators for Snowball Sampling, Workshop on Survey Sampling Theory and Methodology, Vilnius, Lithuania, 2010, August 23–27. Available at http://vilniusworkshop2010.stat.gov.lt/Straipsniai/Shafie_T.pdf.

K. Bartz, J. Blitzstein, J. Liu, Graphs, Bridges and Snowballs: Monte Carlo Maximum Likelihood for Exponential Random Graph Models, 2009, presentation. Available at http://www.kevinbartz.com/uploads/graph/presentation.pdf.

Snijders, 2010, Conditional marginalization for exponential random graph models, Journal of Mathematical Sociology, 34, 239, 10.1080/0022250X.2010.485707

Dijkstra, 1959, A note on two problems in connexion with graphs, Numerische Mathematik, 1, 269, 10.1007/BF01386390

Lu, 2006, A shortest path searching method with area limitation heuristics, vol. 3991, 884

L.S. Buriol, G. Frahling, S. Leonardi, A. Marchetti-Spaccamela, C. Sohler, Counting Triangles in Data Streams, in: ACM Symposium on Principles of Database Systems, 2006, pp. 253–262.

H. Tong, C. Faloutsos, Center-piece subgraphs: problem definition and fast solutions, in: Proceedings of the Twelfth ACM SIGKDD International Conference on KDDM, 2006, pp. 404–413.

Cordella, 1999, Evaluating Performance of the VF Graph Matching Algorithm, 1172

L.P. Cordella, P. Foggia, C. Sansone, M. Vento, An Improved Algorithm for Matching Large Graphs, in: Proc. 3rd IAPR-TC-15 International Workshop on Graph based Representations, Cuen, Italy, 2001, pp. 149–159.

Washio, 2003, State of the art of graph-based data mining, ACM SIGKDD Explorations Newsletter, 5, 59, 10.1145/959242.959249

Kleinberg, 1999, Authoritative sources in a hyperlinked environment, Journal of the ACM (JACM), 46, 604, 10.1145/324133.324140

D. Kempe, J. Kleinberg, E. Tardos, Maximizing the spread of influence through a social network, in: Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining, KDD ’03, 2003, pp. 137–146.

Körner, 1986, Fredman–Komlós bounds and information theory, SIAM Journal on Algebraic Discrete Methods, 7, 560, 10.1137/0607062

Jaccard, 1901, Distribution de la flore alpine dans le bassin des Dranses et dans quelques régions voisines, Bulletin de la Société Vaudoise des Sciences Naturelles, 37, 241

Freeman, 1977, A set of measures of centrality based on betweenness, Sociometry, 40, 35, 10.2307/3033543

Newman, 2004, Finding and evaluating community structure in networks, Physical Review E, 69, 026113, 10.1103/PhysRevE.69.026113

Blondel, 2008, Fast unfolding of communities in large networks, Journal of Statistical Mechanics: Theory and Experiment, 1000

Newman, 2006, Modularity and community structure in networks, Proceedings of the National Academy of Sciences, 103, 8577, 10.1073/pnas.0601602103

J.M. Kleinberg, Challenges in mining social network data: processes, privacy, and paradoxes, in: Proc. 13th Int. Conf. on Knowledge Discovery and Data Mining, KDD ’07, 2007, pp. 4–5.

Martínez Arqué, 2012, Analysis of on-line social networks represented as graphs–extraction of an approximation of community structure using sampling, vol. 7647, 149

Xie, 2011

Leskovec, 2009, Community structure in large networks: natural cluster sizes and the absence of large well-defined clusters, Internet Mathematics, 6, 29, 10.1080/15427951.2009.10129177

A. Pal, S. Chang, J.A. Konstan, Evolution of Experts in Question Answering Communities, in: Proc. 6th Int. AAAI Conf. on Weblogs and Social Media, Dublin, Ireland, 4–7 June, 2012, pp. 274–281.

V. Belák, S. Lam, C. Hayes, Cross-Community Influence in Discussion Fora, in: Proc. 6th Int. AAAI Conf. on Weblogs and Social Media, Dublin, Ireland, 4–7 June, 2012, pp. 34–41.

W. Lin, X. Kong, P. Yu, Q. Wu, Y. Jia, C. Li, Community Detection in Incomplete Information Networks, in: Proc. World Wide Web 2012, WWW 2012, April 16–20, 2012, Lyon, France, pp. 341–349.

I. Kash, J. Lai, H. Zhang, A. Zohar, Economics of BitTorrent Communities, in: Proc. World Wide Web 2012, WWW 2012, April 16–20, 2012, Lyon, France, pp. 221–230.

M. Sachan, D. Contractor, T. Faruquie, L.V. Subramaniam, Using Content and Interactions for Discovering Communities in Social Networks, in: Proc. World Wide Web 2012, WWW 2012, April 16–20, 2012, Lyon, France, pp. 331–340.

S. Alsaleh, R. Nayak, Y. Xu, Grouping People in Social Networks Using a Weighted Multi-Constraints Clustering Method, WCCI 2012 IEEE World Congress on Computational Intelligence June, 10–15, 2012—Brisbane, Australia, in: Proc. Int Conf. on Fuzzy Systems, FUZZ IEEE 2012, pp. 243–250.

B. Li, M.R. Lyu, I. King, Communities of Yahoo! Answers and Baidu Zhidao: Complementing or Competing? WCCI 2012 IEEE World Congress on Computational Intelligence June, 10–15, 2012—Brisbane, Australia, in: Proc. Int. Joint Conf. on Neural Networks, IJCNN 2012, pp. 524–531.

P. Krömer, V. Snásel, J. Platos, M. Kudelka, Z. Horák, An ACO Inspired Weighting Approach for the Spectral Partitioning of Co-authorship Networks, WCCI 2012 IEEE World Congress on Computational Intelligence, June, 10–15, 2012—Brisbane, Australia, in: Proc. Congress on Evolutionary Computation, IEEE CEC 2012, pp. 2477–2483.

C. Danescu-Niculescu-Mizil, L. Lee, B. Pang, J. Kleinberg, Echoes of power: How power differences between people are revealed by linguistic style coordination, in: Proc. World Wide Web 2012, WWW 2012, April 16–20, 2012, Lyon, France, pp. 699–708.

M. Ott, C. Cardie, J. Hancock, Estimating the Prevalence of Deception in Online Review Communities, in: Proc. World Wide Web 2012, WWW 2012, April 16–20, 2012, Lyon, France, pp. 201–210.

S. Ranu, V. Chaoji, R. Rastogi, R. Bhatt, Recommendations to Boost Content Spread in Social Networks, in: Proc. World Wide Web 2012, WWW 2012, April 16–20, 2012, Lyon, France, pp. 530–538.

A. Bonato, J. Janssen, P. Pralat, A Geometric Model for On-line Social Networks, in: Proc. 3rd Workshop on Online Social Networks, WOSN 2010, Boston, MA, USA. http://www.usenix.org/events/wosn10/tech/full_papers/Bonato.pdf.

S. Scellato, C. Mascolo, M. Musolesi, V. Latora, Distance Matters: Geo-social Metrics for Online Social Networks, in: Proc. 3rd Workshop on Online Social Networks, WOSN 2010, Boston, MA, USA. http://www.usenix.org/events/wosn10/tech/full_papers/Scellato.pdf.

X. Zhao, H. Zheng, Orion: Shortest Path Estimation for Large Social Graphs, in: Proc. 3rd Workshop on Online Social Networks, WOSN 2010, Boston, MA, USA. http://www.usenix.org/events/wosn10/tech/full_papers/zhao.pdf.

S. Ghosh, G. Korlam, N. Ganguly, The Effects of Restrictions on Number of Connections in OSNs: A Case-Study on Twitter, in: Proc. 3rd Workshop on Online Social Networks, WOSN 2010, Boston, MA, USA. http://www.usenix.org/events/wosn10/tech/full_papers/Ghosh.pdf.

F. Kooti, H. Yang, M. Cha, K.P. Gummadi, W.A. Mason, The Emergence of Conventions in Online Social Networks, in: Proc. 6th Int. AAAI Conf. on Weblogs and Social Media, Dublin, Ireland, 4–7 June, 2012, pp. 194–201.

M. Wattenhofer, R. Wattenhofer, Z. Zhu, The YouTube Social Network, in: Proc. 6th Int. AAAI Conf. on Weblogs and Social Media, Dublin, Ireland, 4–7 June, 2012, pp. 354–361.

P. Dandekar, B. Wiedenbeck, A. Goel, M. Wellman, Strategic Formation of Credit Networks, in: Proc. World Wide Web 2012, WWW 2012, April 16–20, 2012, Lyon, France, pp. 559–568.

T. Kamei, K. Ono, M. Kumano, M. Kimura, Predicting Missing Links in Social Networks with Hierarchical Dirichlet Processes, WCCI 2012 IEEE World Congress on Computational Intelligence, June, 10–15, 2012—Brisbane, Australia, in: Proc. Int. Joint Conf. on Neural Networks, IJCNN 2012, pp. 1816–1823.

S. Miyamoto, S. Suzuki, S. Takumi, Clustering in Tweets Using a Fuzzy Neighborhood Model, WCCI 2012 IEEE World Congress on Computational Intelligence June, 10–15, 2012—Brisbane, Australia, in: Proc. Int Conf. on Fuzzy Systems, FUZZ IEEE 2012, pp. 251–256.

H. Gao, J. Tang, H. Liu, Exploring Social–Historical Ties on Location-Based Social Networks, in: Proc. 6th Int. AAAI Conf. on Weblogs and Social Media, Dublin, Ireland, 4–7 June, 2012, pp. 114–121.

S. Adali, F. Sisenda, M. Magdon-Ismail, Actions speak as loud as words: Predicting relationships from social behavior data, in: Proc. World Wide Web 2012, WWW 2012, April 16–20, 2012, Lyon, France, pp. 689–698.

A. Anagnostopoulos, L. Becchetti, C. Castillo, A. Gionis, S. Leonardi, Online team formation in social networks, in: Proc. World Wide Web 2012, WWW 2012, April 16–20, 2012, Lyon, France, pp. 839–848.

Dunn, 2012

R. Agrawal, M. Potamias, E. Terzi, Learning the Nature of Information in Social Networks, in: Proc. 6th Int. AAAI Conf. on Weblogs and Social Media, Dublin, Ireland, 4–7 June, 2012, pp. 2–9.

C. Yang, R. Harkreader, J. Zhang, S. Shin, G. Gu, Analyzing Spammers’ Social Networks For Fun and Profit, in: Proc. World Wide Web 2012, WWW 2012, April 16–20, 2012, Lyon, France, pp. 71–80.

E. Bakshy, I. Rosenn, C. Marlow, L. Adamic, The Role of Social Networks in Information Diffusion, in: Proc. World Wide Web 2012, WWW 2012, April 16–20, 2012, Lyon, France, pp. 519–528.

M.-J. Lesot, F. Nel, T. Delavallade, P. Capet, B. Bouchon-Meunier, Two Methods for Internet Buzz Detection Exploiting the Citation Graph, WCCI 2012 IEEE World Congress on Computational Intelligence, June 10–15, 2012, Brisbane, Australia, in: Proc. Int. Conf. on Fuzzy Systems, FUZZ IEEE 2012, pp. 1368–1375.

Dawkins, 1989

Leskovec, 2009, Meme-tracking and the dynamics of the news cycle, 497

J. Poschko, Exploring Twitter Hashtags, 2010. Available at http://twex.poeschko.com/media/files/ExploringTwitterHashtags.pdf.

BigML, Machine Learning for Everyone. Available at https://bigml.com/.