Evaluation of Doc’EDS: a French semantic search tool to query health documents from a clinical data warehouse
Abstract
Unstructured data from electronic health records represent a wealth of information. Doc’EDS is a pre-screening tool based on textual and semantic analysis that provides a graphical user interface for searching documents in French. The aim of this study was to present the Doc’EDS tool and to provide a formal evaluation of its semantic features. Doc’EDS is a search tool built on top of the clinical data warehouse developed at Rouen University Hospital. It is a multilevel search engine combining structured and unstructured data, and it also provides basic analytical features and semantic utilities. A formal evaluation was conducted to measure the impact of Natural Language Processing algorithms. Approximately 18.1 million narrative documents are stored in Doc’EDS. The formal evaluation was conducted on 5000 manually collected clinical concepts. The F-measures for negative concepts and hypothetical concepts were 0.89 and 0.57, respectively. This formal evaluation showed that Doc’EDS can handle language subtleties to enhance advanced full-text search in French health documents. The Doc’EDS tool is currently used daily to help researchers identify patient cohorts using unstructured data.
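For readers unfamiliar with the metric, the F-measure reported above is commonly computed as the harmonic mean of precision and recall. The sketch below assumes the balanced form (F1), which the abstract does not state explicitly. The function name `f_measure` and the counts are hypothetical and chosen only for illustration; they are not data from the study.

```python
def f_measure(tp: int, fp: int, fn: int) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    tp, fp, fn are counts of true positives, false positives and
    false negatives for one category (e.g. concepts flagged as negated).
    """
    if tp == 0:
        # No correct detections: precision and recall are both 0 (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only; NOT taken from the Doc'EDS evaluation.
score = f_measure(tp=30, fp=10, fn=20)
print(f"precision=0.75, recall=0.60, F-measure={score:.3f}")
```

Because the harmonic mean is pulled toward the lower of the two values, a category with either low precision or low recall ends up with a markedly lower F-measure.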