Design and implementation of Metta, a metasearch engine for biomedical literature retrieval intended for systematic reviewers

Health Information Science and Systems - Volume 2 - Pages 1-9 - 2014
Neil R Smalheiser1, Can Lin2, Lifeng Jia3, Yu Jiang2, Aaron M Cohen4, Clement Yu3, John M Davis1, Clive E Adams5, Marian S McDonagh4, Weiyi Meng2
1Department of Psychiatry and Psychiatric Institute, University of Illinois at Chicago, Chicago, USA
2Department of Computer Science, Binghamton University, Binghamton, USA
3Department of Computer Science, University of Illinois at Chicago, Chicago, USA
4Department of Medical Informatics and Clinical Epidemiology, Oregon Health & Science University, Portland, USA
5Division of Psychiatry, University of Nottingham, UK

Abstract

Individuals and groups who write systematic reviews and meta-analyses in evidence-based medicine regularly carry out literature searches across multiple search engines linked to different bibliographic databases. They therefore have an urgent need for a suitable metasearch engine that saves time spent on repeated searches and removes duplicate publications from initial consideration. Unlike general users, who typically search for a few highly relevant (or highly recent) articles, systematic reviewers seek a comprehensive set of articles on a given topic that satisfy specific criteria. This creates special requirements and challenges for metasearch engine design and implementation.
We created a federated search tool that is connected to five databases: PubMed, EMBASE, CINAHL, PsycINFO, and the Cochrane Central Register of Controlled Trials. Retrieved bibliographic records were shown online; optionally, results could be de-duplicated and exported in both BibTeX and XML formats.
The query interface was extensively modified in response to feedback from users within our team. Besides a general search track and one focused on human-related articles, we also added search tracks optimized to identify case reports and systematic reviews. Although users could modify the preset search options, these were rarely if ever altered in practice. Up to several thousand retrieved records could be exported within a few minutes. De-duplication of records returned from multiple databases was carried out in a prioritized fashion that favored retaining citations returned from PubMed.
Systematic reviewers are used to formulating complex queries using strategies and search tags that are specific to individual databases. Metta offers a different approach that may save substantial time, but it requires modification of current search strategies and better indexing of randomized controlled trial articles.
We envision Metta as one piece of a multi-tool pipeline that will assist systematic reviewers in retrieving, filtering and assessing publications. As such, Metta may find wide utility for anyone who is carrying out a comprehensive search of the biomedical literature.
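
The prioritized de-duplication described above, in which the PubMed copy of a duplicated citation is retained, can be illustrated with a minimal sketch. This is not Metta's actual algorithm: the matching key (normalized title plus year), the record fields, the `CENTRAL` label, and the ranking of the non-PubMed databases are all assumptions made for illustration. Only the preference for PubMed comes from the text.

```python
import re

# Hypothetical priority order (lower value wins). Only "PubMed first" is
# stated in the abstract; the ranking of the other sources is an assumption.
SOURCE_PRIORITY = {"PubMed": 0, "EMBASE": 1, "CINAHL": 2, "PsycINFO": 3, "CENTRAL": 4}


def match_key(record):
    """Build a crude duplicate key: normalized title plus year (an assumption)."""
    title = re.sub(r"[^a-z0-9]+", " ", record["title"].lower()).strip()
    return (title, record.get("year"))


def deduplicate(records):
    """Keep one record per key, preferring the highest-priority source."""
    best = {}
    for rec in records:
        key = match_key(rec)
        # Unknown sources rank below every listed database.
        rank = SOURCE_PRIORITY.get(rec["source"], len(SOURCE_PRIORITY))
        if key not in best or rank < best[key][0]:
            best[key] = (rank, rec)
    # Dict order follows the first appearance of each key.
    return [rec for _, rec in best.values()]


if __name__ == "__main__":
    sample = [
        {"source": "EMBASE", "title": "Aspirin for Headache.", "year": 2010},
        {"source": "PubMed", "title": "Aspirin for headache", "year": 2010},
        {"source": "CINAHL", "title": "Another study", "year": 2011},
    ]
    for rec in deduplicate(sample):
        print(rec["source"], "-", rec["title"])
```

A real system would need a far more tolerant matching rule (author names, journal, pagination, DOI), since titles and years often differ slightly between databases.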
