Analysis of direct citation, co-citation and bibliographic coupling in scientific topic identification
Abstract
We examine how citation network structure affects the ability to identify valuable research topics in Computer Science literature. Using the bibliographic information available in the DBLP database, we extract candidate phrases from the abstracts of scientific papers. We then construct citation networks based on direct citation, co-citation and bibliographic coupling relationships between the papers. The candidate research topics, in the form of keyphrases and n-grams, are ranked and filtered by a graph-text ranking algorithm. The highest-ranked candidate topics are then evaluated by domain experts and against the Wikipedia knowledge base. The results obtained from the citation networks are complementary: some pairs of networks return valid but non-overlapping output phrases. In particular, bibliographic coupling appears to capture more unique information than either direct citation or co-citation. These findings point to the possible added value of combining bibliographic coupling analysis with other structures, while calling into question the benefit of combining direct citation and co-citation. We expect our findings to be utilised in the design of methods for research topic identification.
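To make the three network types concrete, the following is a minimal sketch of how they might be derived from a mapping of papers to their reference lists. It is not the authors' implementation: the function name `build_citation_networks`, the toy paper identifiers and the choice of shared-paper counts as edge weights are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def build_citation_networks(references):
    """Derive three citation networks from `references`.

    `references` maps each paper id to the set of paper ids it cites
    (a hypothetical input format chosen for this sketch).
    """
    # Direct citation: a directed edge from the citing to the cited paper.
    direct = set()
    cited_by = defaultdict(set)
    for paper, cited in references.items():
        for target in cited:
            direct.add((paper, target))
            cited_by[target].add(paper)

    # Co-citation: two papers are linked when a third paper cites both.
    # The weight counts how many papers cite the pair together.
    co_citation = defaultdict(int)
    for cited in references.values():
        for a, b in combinations(sorted(cited), 2):
            co_citation[(a, b)] += 1

    # Bibliographic coupling: two papers are linked when they share a
    # reference. The weight counts the references they have in common.
    coupling = defaultdict(int)
    for citers in cited_by.values():
        for a, b in combinations(sorted(citers), 2):
            coupling[(a, b)] += 1

    return direct, dict(co_citation), dict(coupling)


# Toy data (hypothetical): A and B both cite C and D; C cites D.
toy = {
    "A": {"C", "D"},
    "B": {"C", "D"},
    "C": {"D"},
    "D": set(),
}
direct, co_citation, coupling = build_citation_networks(toy)
```

In this toy example, C and D are co-cited by both A and B, whereas A and B are bibliographically coupled through their two shared references. The three networks therefore connect different pairs of papers even though they are built from the same citation data.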