SocialLink: exploiting graph embeddings to link DBpedia entities to Twitter profiles

Progress in Artificial Intelligence - Volume 7 - Pages 251-272 - 2018
Yaroslav Nechaev1,2, Francesco Corcoglioniti1, Claudio Giuliano1
1Fondazione Bruno Kessler, Trento, Italy
2University of Trento, Trento, Italy

Abstract

SocialLink is a project that matches Twitter profiles to their corresponding entities in DBpedia. Built to bridge the vibrant Twitter social media world and the Linked Open Data cloud, SocialLink enables knowledge transfer between the two: it helps Semantic Web practitioners harvest the vast amounts of information available on Twitter, and it makes DBpedia data available for social media analysis tasks. In this paper, we extend the original SocialLink approach by exploiting graph-based features derived from both DBpedia and Twitter, represented as graph embeddings learned from vast amounts of unlabeled data. Introducing these new features required us to redesign our deep neural network-based candidate selection algorithm. With the redesigned algorithm, we experimentally demonstrate a significant improvement in the performance of SocialLink.
