The use of citation context to detect the evolution of research topics: a large-scale analysis

Scientometrics - Volume 126 - Pages 2971-2989 - 2021
Chaker Jebari (1), Enrique Herrera-Viedma (2), Manuel Jesus Cobo (3)
(1) Department of Information Technology, University of Technology and Applied Sciences, Ibri, Sultanate of Oman
(2) Andalusian Research Institute in Data Science and Computational Intelligence, University of Granada, Granada, Spain
(3) Department of Computer Science and Engineering, University of Cádiz, Cádiz, Spain

Abstract

With the exponential increase in the number of published papers, discovering how topics evolve is becoming increasingly important for everyone involved in research, including researchers, institutes, research funding bodies, and decision-makers. This study presents a large-scale analysis of the evolution of research topics in the biomedical and life sciences using the citation contexts of the collected papers, or more precisely their citing sentences. Using 64,350 papers published in PubMed Central between 2008 and 2018, we determined the research trends for ten research topics. Moreover, we studied how these topics evolve across countries and across the most common journals in the biomedical and life sciences.
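
As a rough illustration of what trend detection from citing sentences can look like, the sketch below counts, for each year, the share of citing sentences that mention a topic. It is not the authors' pipeline, which the abstract does not describe. The toy sentences, the three topic names, the keyword lists, and the function `topic_trends` are all hypothetical, and simple keyword matching stands in for whatever topic detection the study actually used.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: (publication year, citing sentence).
citing_sentences = [
    (2008, "Gene expression profiling was used as described previously [3]."),
    (2008, "Stem cell therapy has shown promise in earlier trials [7]."),
    (2018, "Deep learning models outperform earlier classifiers [12]."),
    (2018, "Recent work applies deep learning to gene expression data [4]."),
]

# Hypothetical topics and keywords; the abstract does not list the ten topics studied.
topics = {
    "gene expression": ["gene expression"],
    "stem cells": ["stem cell"],
    "deep learning": ["deep learning"],
}


def topic_trends(sentences, topics):
    """Return {topic: {year: share of that year's citing sentences mentioning the topic}}."""
    totals = Counter()            # citing sentences per year
    hits = defaultdict(Counter)   # topic -> year -> matching sentences
    for year, text in sentences:
        totals[year] += 1
        lowered = text.lower()
        for topic, keywords in topics.items():
            if any(kw in lowered for kw in keywords):
                hits[topic][year] += 1
    return {
        topic: {year: hits[topic][year] / totals[year] for year in sorted(totals)}
        for topic in topics
    }


trends = topic_trends(citing_sentences, topics)
for topic, by_year in trends.items():
    print(topic, by_year)
```

Normalizing by the number of citing sentences per year keeps a topic's trend from simply tracking the overall growth in publications. The same counts could be grouped by country or journal instead of year to mirror the comparisons mentioned in the abstract.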
