Compact representation of Web graphs with extended functionality

Information Systems - Tập 39 - Trang 152-174 - 2014
Nieves R. Brisaboa1, Susana Ladra1, Gonzalo Navarro2
1Database Laboratory, University of A Coruña, Spain
2Department of Computer Science, University of Chile, Chile

Tài liệu tham khảo

D. Donato, S. Millozzi, S. Leonardi, P. Tsaparas, Mining the inner structure of the Web graph, in: Proceedings of 8th Workshop on the Web and Databases (WebDB), 2005, pp. 145–150. J. Kleinberg, R. Kumar, P. Raghavan, S. Rajagopalan, A. Tomkins, The Web as a graph: measurements, models, and methods, in: Proceedings of 5th Annual International Conference on Computing and Combinatorics (COCOON), Lecture Notes in Computer Sciences, vol. 1627, 1999, pp. 1–17. Brin, 1998, The anatomy of a large-scale hypertextual Web search engine, Computer Networks, 30, 107, 10.1016/S0169-7552(98)00110-X L. Page, S. Brin, R. Motwani, T. Winograd, The PageRank Citation Ranking: Bringing Order to the Web, Technical Report 1999–66, Stanford InfoLab, 1999. Becchetti, 2008, Link analysis for Web spam detection, ACM Transactions on the Web, 2, 10.1145/1326561.1326563 Henzinger, 2000, On near-uniform URL sampling, Computer Networks, 33, 295, 10.1016/S1389-1286(00)00055-4 H. Saito, M. Toyoda, M. Kitsuregawa, K. Aihara, A large-scale study of link spam detection by graph algorithms, in: Proceedings of 3rd International Workshop on Adversarial Information Retrieval on the Web (AIRWeb), 2007, p. 48. A. Benczúr, K. Csalogány, T. Sarlós, M. Uher, Spamrank: fully automatic link spam detection, in: Proceedings of 1st International Workshop on Adversarial Information Retrieval on the Web (AIRWeb), 2005. B. Wu, B. Davison, Identifying link farm spam pages, in: Proceedings of 14th International World Wide Web Conference (WWW), Special Interest Tracks and Posters, 2005, pp. 820–829. Kumar, 1999, Trawling the Web for emerging cyber-communities, Computer Networks, 31, 1481, 10.1016/S1389-1286(99)00040-7 G. Buehrer, K. Chellapilla, A scalable pattern mining approach to Web graph compression with communities, in: Proceedings of 1st ACM International Conference on Web Search and Data Mining (WSDM), 2008, pp. 95–106. D. Gibson, J. Kleinberg, P. Raghavan, Inferring Web communities from link topology, in: Proceedings of 9th ACM Conference on Hypertext and Hypermedia, 1998, pp. 225–234. J. Leskovec, K.J. Lang, M. Mahoney, Empirical comparison of algorithms for network community detection, in: Proceedings of the 19th International Conference on World Wide Web, WWW'10, ACM, New York, NY, USA, 2010, pp. 631–640. Vitter, 2001, External memory algorithms and data structures, ACM Computing Surveys, 33, 209, 10.1145/384192.384193 N. Brisaboa, S. Ladra, G. Navarro, K2-trees for compact web graph representation, in: Proceedings of 16th International Symposium on String Processing and Information Retrieval (SPIRE), Lecture Notes in Computer Sciences, vol. 5721, 2009, pp. 18–30. P. Boldi, S. Vigna, The WebGraph framework I: compression techniques, in: Proceedings of 13th International World Wide Web Conference (WWW), 2004, pp. 595–601. K. Bharat, A. Broder, M. Henzinger, P. Kumar, S. Venkatasubramanian, The connectivity server: fast access to linkage information on the Web, in: Proceedings of 7th International World Wide Web Conference (WWW), 1998, pp. 469–477. Broder, 2000, Graph structure in the Web, Computer Networks, 33, 309, 10.1016/S1389-1286(00)00083-9 M. Adler, M. Mitzenmacher, Towards compressing Web graphs, in: Proceedings of 11th Data Compression Conference (DCC), 2001, pp. 203–212. T. Suel, J. Yuan, Compressing the graph structure of the Web, in: Proceedings of 11th Data Compression Conference (DCC), 2001, pp. 213–222. K. Randall, R. Stata, R. Wickremesinghe, J. Wiener, The LINK Database: Fast Access to Graphs of the Web, Technical Report 175, Compaq Systems Research Center, Palo Alto, CA, 2001. S. Raghavan, H. Garcia-Molina, Representing Web graphs, in: Proceedings of 19th International Conference on Data Engineering (ICDE), 2003, p. 405. Boldi, 2009, Permuting Web and social graphs, Internet Mathematics, 6, 257, 10.1080/15427951.2009.10390641 P. Boldi, M. Rosa, M. Santini, S. Vigna, Layered label propagation: a multiresolution coordinate-free ordering for compressing social networks, in: Proceedings of 20th International World Wide Web Conference (WWW), 2011, pp. 587–596. F. Claude, G. Navarro, Fast and compact Web graph representations, ACM Transactions on the Web (TWEB) 4 (2010) article 16. Larsson, 2000, Off-line dictionary-based compression, Proceedings of the IEEE, 88, 1722, 10.1109/5.892708 F. Claude, G. Navarro, Extended compact Web graph representations, in: T. Elomaa, H. Mannila, P. Orponen (Eds.), Algorithms and Applications (Ukkonen Festschrift), Lecture Notes in Computer Sciences, vol. 6060, 2010, pp. 77–91. Y. Asano, Y. Miyawaki, T. Nishizeki, Efficient compression of Web graphs, in: Proceedings of 14th Annual International Conference on Computing and Combinatorics (COCOON), Lecture Notes in Computer Science, vol. 5092, 2008, pp. 1–11. Apostolico, 2009, Graph compression by BFS, Algorithms, 2, 1031, 10.3390/a2031031 C. Hernández, G. Navarro, Compression of Web and social graphs supporting neighbor and community queries, in: Proceedings of 5th ACM Workshop on Social Network Mining and Analysis (SNA-KDD), 2011. S. Grabowski, W. Bieniecki, Merging adjacency lists for efficient web graph compression, in: Man–Machine Interactions 2 AISC 103, 2011, pp. 385–392. N. Brisaboa, R. Cánovas, F. Claude, M. Martínez-Prieto, G. Navarro, Compressed string dictionaries, in: Proceedings of 10th International Symposium on Experimental Algorithms (SEA), Lecture Notes in Computer Sciences, vol. 6630, 2011, pp. 136–147. Samet, 2006 I. Simecek, Sparse matrix computations using the quadtree storage format, in: Proceedings of the 11th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC), 2009, pp. 168–173. G. Jacobson, Space-efficient static trees and graphs, in: Proceedings of 30th IEEE Symposium on Foundations of Computer Science (FOCS), 1989, pp. 549–554. Munro, 2001, Succinct representation of balanced parentheses and static trees, SIAM Journal on Computing, 31, 762, 10.1137/S0097539799364092 Benoit, 2005, Representing trees of higher degree, Algorithmica, 43, 275, 10.1007/s00453-004-1146-6 Geary, 2006, A simple optimal representation for balanced parentheses, Theoretical Computer Science (TCS), 368, 231, 10.1016/j.tcs.2006.09.014 K. Sadakane, G. Navarro, Fully-functional succinct trees, in: Proceedings of 21st Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), 2010, pp. 134–149. J. Jansson, K. Sadakane, W.-K. Sung, Ultra-succinct representation of ordered trees, in: Proceedings of 18th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), 2007, pp. 575–584. D. Arroyuelo, R. Cánovas, G. Navarro, K. Sadakane, Succinct trees in practice, in: Proceedings of 11th Workshop on Algorithm Engineering and Experiments (ALENEX), 2010, pp. 84–97. I. Munro, Tables, in: Proceedings of 16th Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS), Lecture Notes in Computer Sciences, vol. 1180, 1996, pp. 37–42. R. González, S. Grabowski, V. Mäkinen, G. Navarro, Practical implementation of rank and select queries, in: Poster Proceedings of Volume of 4th Workshop on Efficient and Experimental Algorithms (WEA), 2005, pp. 27–38. S. Gog, M. Petri, Optimized succinct data structures for massive data, Software: Practice and Experience. DOI: http://dx.doi.org/10.1002/spe.2198, in press. Mehlhorn, 1984 M. He, I. Munro, Succinct representations of dynamic strings, in: Proceedings of 17th International Symposium on String Processing and Information Retrieval (SPIRE), Lecture Notes in Computer Sciences, vol. 6393, 2010, pp. 334–346. G. Navarro, K. Sadakane, Fully-functional static and dynamic succinct trees, ACM Transactions on Algorithms (TALG), in press. Brisaboa, 2013, DACs, Information Processing and Management, 49, 392, 10.1016/j.ipm.2012.08.003 Boldi, 2004, Ubicrawler, Software, 34, 711 F. Claude, S. Ladra, Practical representations for Web and social graphs, in: Proceedings of 20th ACM Conference on Information and Knowledge Management (CIKM), 2011, pp. 1185–1190. C. Hernández, G. Navarro, Compressed representation of web and social networks via dense subgraphs, in: Proceedings of 19th International Symposium on String Processing and Information Retrieval (SPIRE), Lecture Notes in Computer Sciences, vol. 7608, 2012, pp. 264–276. Shi, 2012, Optimizing K2-trees, Computers and Mathematics with Applications, 63, 427, 10.1016/j.camwa.2011.07.060 S. Álvarez-García, N.R. Brisaboa, J. Fernández, M. Martínez-Prieto, Compressed k2-triples for full-in-memory RDF engines, in: Proceedings of 17th Americas Conference on Information Systems (AMCIS), 2011. S. Álvarez-García, N. Brisaboa, S. Ladra, O. Pedreira, A compact representation of graph databases, in: Proceedings of 8th Workshop on Mining and Learning with Graphs (MLG), ACM, 2010, pp. 18–25. J. Barbay, F. Claude, G. Navarro, Compact rich-functional binary relation representations, in: Proceedings of 9th Latin American Symposium on Theoretical Informatics (LATIN), Lecture Notes in Computer Sciences, vol. 6034, 2010, pp. 170–183. M. Romero, N. Brisaboa, M.A. Rodríguez, The smo-index: a succinct moving object structure for timestamp and interval queries, in: Proceedings of the 20th International Conference on Advances in Geographic Information Systems, SIGSPATIAL '12, ACM, 2012, pp. 498–501. N. Brisaboa, G. de Bernardo, G. Navarro, Compressed dynamic binary relations, in: Proceedings of 22th Data Compression Conference (DCC), 2012, pp. 52–61.