Analysis of direct citation, co-citation and bibliographic coupling in scientific topic identification

Journal of Information Science - Volume 48 Issue 3 - Pages 349-373 - 2022
Rajmund Klemiński, Przemysław Kazienko, Tomasz Kajdanowicz
Wroclaw University of Science and Technology, Poland

Abstract

In our study, we examine the impact of citation network structures on the ability to discern valuable research topics in Computer Science literature. We use the bibliographic information available in the DBLP database to extract candidate phrases from scientific paper abstracts. We then construct citation networks based on direct citation, co-citation and bibliographic coupling relationships between the papers. The candidate research topics, in the form of keyphrases and n-grams, are subsequently ranked and filtered by a graph-text ranking algorithm. The highest-ranked candidate topics are then evaluated by domain experts and against the Wikipedia knowledge base. The results obtained from these citation networks are complementary: for some pairs of networks, the output phrases are valid but do not overlap. In particular, bibliographic coupling appears to capture more unique information than either direct citation or co-citation. These findings point towards possible added value in combining bibliographic coupling analysis with other structures. At the same time, they call the value of combining direct citation and co-citation into question. We expect our findings to be utilised in method design for research topic identification.
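The three citation relationships named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how edges are weighted, thresholded or directed, so the function names, the toy `citations` mapping and the choice of simple counts as edge weights are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def direct_citation(citations):
    """Link two papers if one cites the other (direction dropped; an assumption)."""
    return {
        frozenset((paper, ref))
        for paper, refs in citations.items()
        for ref in refs
        if paper != ref
    }


def co_citation(citations):
    """Link two papers that appear together in some reference list.

    The weight is the number of citing papers that cite both.
    """
    weights = defaultdict(int)
    for refs in citations.values():
        for a, b in combinations(sorted(set(refs)), 2):
            weights[frozenset((a, b))] += 1
    return dict(weights)


def bibliographic_coupling(citations):
    """Link two papers that share at least one reference.

    The weight is the number of references the two papers have in common.
    """
    cited_by = defaultdict(set)
    for paper, refs in citations.items():
        for ref in refs:
            cited_by[ref].add(paper)
    weights = defaultdict(int)
    for citing_papers in cited_by.values():
        for a, b in combinations(sorted(citing_papers), 2):
            weights[frozenset((a, b))] += 1
    return dict(weights)


# Hypothetical toy data: each paper maps to the papers it cites.
citations = {
    "P1": ["P3", "P4"],
    "P2": ["P3", "P4"],
    "P3": ["P4"],
    "P4": [],
}

dc = direct_citation(citations)
cc = co_citation(citations)
bc = bibliographic_coupling(citations)
```

In this toy example, P1 and P2 never cite each other and are never cited together, so only bibliographic coupling connects them (through their shared references P3 and P4). This shows how the three networks can connect different pairs of papers, which is the structural difference the study compares.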
