Using semantic graphs to detect overlapping target events and story lines from newspaper articles
Abstract
Event detection from text data is an active area of research. While the emphasis in the literature has been on event identification and labeling using a single data source, this work considers event and story line detection when using a large number of data sources. In this setting, it is natural for different events in the same domain, e.g., violence, sports, politics, to occur at the same time and for different story lines about the same event to emerge. To capture events in this setting, we propose an Offline algorithm that detects events and story lines about events for a target domain given a news article collection. Our algorithm leverages a multi-relational sentence-level semantic graph and well-known graph properties to identify overlapping events and story lines within the events. We then extend this algorithm for an Online setting. Both the Offline and Online approaches are evaluated using two large data sets containing millions of news articles from a large number of sources. Our empirical analysis shows that methods using the proposed semantic graph beat the state of the art in terms of precision and recall while providing more complete event summaries.