Towards a unified search: Improving PubMed retrieval with full text

Journal of Biomedical Informatics - Volume 134 - Page 104211 - 2022
Won Kim¹, Lana Yeganova¹, Donald C. Comeau¹, W. John Wilbur¹, Zhiyong Lu¹
¹ National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), Bethesda, MD 20894, USA
