Preprint citation practice in PLOS

Scientometrics - Volume 127 - Pages 6895-6912 - 2022
Marc Bertin1, Iana Atanassova2,3
1ELICO Laboratory, Université Claude Bernard Lyon 1, Villeurbanne cedex, France
2CRIT Laboratory, Université de Bourgogne Franche-Comté, Besançon, France
3Institut Universitaire de France (IUF), Paris, France

Abstract

The role of preprints in scientific production and their share of citations have grown over the past 10 years. In this paper we study several aspects of preprint citations: their progression over time, their relative frequencies with respect to the IMRaD structure of articles, and their distributions per preprint database and per PLOS journal. We processed the PLOS corpus, which covers 7 journals and a total of about 240,000 articles up to January 2021, and produced a dataset of 8460 preprint citation contexts that cite 12 different preprint databases. Our results show that preprint citations are found with the highest frequency in the Method section of articles, though small variations exist between journals. PLOS Computational Biology stands out, as it contains more than three times more preprint citations than any other PLOS journal. We also examine the relative shares of the different preprint databases. While arXiv and bioRxiv are the most frequent citation sources, bioRxiv's disciplinary nature is visible: it is the source of more than 70% of preprint citations in PLOS Biology, PLOS Genetics and PLOS Pathogens. We also compared the lexical content of preprint citation contexts with that of citation contexts for peer-reviewed publications. Finally, a lexicometric analysis shows that preprint citation contexts differ significantly from citation contexts of peer-reviewed publications. This confirms that authors use different lexical content when citing preprints than when citing other sources.
