Web of Science use in published research and review papers 1997–2017: a selective, dynamic, cross-domain, content-based analysis

Scientometrics - Volume 115 - Pages 1-20 - 2017
Kai Li1, Jason Rollins2, Erjia Yan1
1Drexel University, Philadelphia, USA
2Clarivate Analytics, San Francisco, USA

Abstract

Clarivate Analytics’s Web of Science (WoS) is the world’s leading scientific citation search and analytical information platform. It is used both as a research tool supporting a broad array of scientific tasks across diverse knowledge domains and as a dataset for large-scale, data-intensive studies. WoS has been used in thousands of published academic studies over the past 20 years. It is also the most enduring commercial legacy of Eugene Garfield. Despite the central position WoS holds in contemporary research, its quantitative impact has not previously been examined in rigorous scientific studies. To better understand how this key piece of Eugene Garfield’s heritage has contributed to science, we investigated the ways in which WoS (and associated products and features) is mentioned in a sample of 19,478 English-language research and review papers published between 1997 and 2017, as indexed in WoS databases. We offered descriptive analyses of the distribution of the papers across countries, institutions and knowledge domains. We also used natural language processing techniques to identify the verbs and nouns in the abstracts of these papers that are grammatically connected to WoS-related phrases. This is the first study to empirically investigate, in both scientometric and linguistic terms, how use of the WoS platform is documented in published academic papers.
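The idea of collecting verbs and nouns that are grammatically connected to a WoS-related phrase can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the abstract does not describe the parser or the phrase list, so the sketch assumes that dependency parses are already available, uses two hand-annotated example sentences, and treats "WoS" as a single token. The names `tok`, `TARGETS`, `CONTENT_POS`, and `connected_words` are hypothetical.

```python
from collections import Counter

# Hypothetical target lemmas; the study's actual phrase list is not given in the abstract.
TARGETS = {"wos"}
# Parts of speech of interest, using Universal Dependencies style tags.
CONTENT_POS = {"VERB", "NOUN"}


def tok(idx, lemma, pos, head):
    """Build one token of a dependency parse; head 0 marks the sentence root."""
    return {"id": idx, "lemma": lemma, "pos": pos, "head": head}


def connected_words(tokens, targets=TARGETS):
    """Return (lemma, pos) for verbs and nouns directly linked to a target token.

    A word counts as linked if it is the syntactic head of a target token
    or a direct dependent of one.
    """
    by_id = {t["id"]: t for t in tokens}
    target_ids = {t["id"] for t in tokens if t["lemma"].lower() in targets}
    found = []
    for t in tokens:
        is_head_of_target = any(by_id[i]["head"] == t["id"] for i in target_ids)
        is_child_of_target = t["head"] in target_ids
        if (is_head_of_target or is_child_of_target) and t["pos"] in CONTENT_POS:
            found.append((t["lemma"], t["pos"]))
    return found


# Hand-annotated parse (an assumption): "We searched WoS for relevant studies."
SENT_1 = [
    tok(1, "we", "PRON", 2),
    tok(2, "search", "VERB", 0),
    tok(3, "WoS", "PROPN", 2),
    tok(4, "for", "ADP", 6),
    tok(5, "relevant", "ADJ", 6),
    tok(6, "study", "NOUN", 2),
    tok(7, ".", "PUNCT", 2),
]

# Hand-annotated parse (an assumption): "Data from the WoS database were analyzed."
SENT_2 = [
    tok(1, "data", "NOUN", 7),
    tok(2, "from", "ADP", 5),
    tok(3, "the", "DET", 5),
    tok(4, "WoS", "PROPN", 5),
    tok(5, "database", "NOUN", 1),
    tok(6, "be", "AUX", 7),
    tok(7, "analyze", "VERB", 0),
    tok(8, ".", "PUNCT", 7),
]

# Tally linked verbs and nouns across the toy corpus.
counts = Counter(w for sent in (SENT_1, SENT_2) for w in connected_words(sent))
print(counts)
```

In the first sentence the verb "search" governs the WoS token, and in the second the noun "database" does, so each is counted once. In practice the parses would come from a dependency parser rather than hand annotation, and multi-word phrases such as "Web of Science" would need to be matched as spans before looking up their heads and dependents.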
