Information from Searching Content with an Ontology-Utilizing Toolkit (iSCOUT)

Journal of Digital Imaging - Volume 25 - Pages 512-519 - 2012
Ronilda Lacson1,2, Katherine P. Andriole1, Luciano M. Prevedello1, Ramin Khorasani1
1Center for Evidence-Based Imaging, Department of Radiology, Brigham and Women’s Hospital, Harvard Medical School, Boston, USA
2Brookline, USA

Abstract

Radiology reports are permanent legal documents that serve as the official interpretation of imaging tests. Manual analysis of textual information contained in these reports requires significant time and effort. This study describes the development and initial evaluation of a toolkit that enables automated identification of relevant information from within these largely unstructured text reports. We developed and made publicly available a natural language processing toolkit, Information from Searching Content with an Ontology-Utilizing Toolkit (iSCOUT). Core functions are included in the following modules: the Data Loader, Header Extractor, Terminology Interface, Reviewer, and Analyzer. The toolkit enables searching for specific terms and retrieving radiology reports that contain exact term matches as well as similar or synonymous term matches within the text of the report. The Terminology Interface is the main component of the toolkit. It allows query expansion based on synonyms from a controlled terminology (e.g., RadLex or National Cancer Institute Thesaurus [NCIT]). We evaluated iSCOUT document retrieval of radiology reports that contained liver cysts, and compared precision and recall with and without using NCIT synonyms for query expansion. Utilizing NCIT, iSCOUT retrieved radiology reports with documented liver cysts with a precision of 0.92 and a recall of 0.96. This recall (i.e., utilizing the Terminology Interface) is significantly better than the recall obtained using either of the two search terms alone (0.72, p = 0.03 for "liver cyst" and 0.52, p = 0.0002 for "hepatic cyst"). iSCOUT reliably assembled relevant radiology reports for a cohort of patients with liver cysts, with significant improvement in document retrieval when utilizing controlled lexicons.
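
The query expansion and evaluation described in the abstract can be illustrated with a minimal Python sketch. This is not iSCOUT's code or its API. The synonym table, the function names, and the four toy reports are all assumptions made for illustration, and the table is hard-coded rather than retrieved from NCIT or RadLex. The sketch only shows the general idea: expand a search term with its synonyms, retrieve reports that match any of the expanded terms, and compare precision and recall against a manually labeled set.

```python
import re

# Hypothetical synonym table standing in for a controlled-terminology lookup
# (the real toolkit draws synonyms from NCIT or RadLex; this dict is an assumption).
SYNONYMS = {
    "liver cyst": ["hepatic cyst"],
}

def expand_query(term, synonyms=SYNONYMS):
    """Return the term followed by its synonyms, lower-cased, without duplicates."""
    expanded = []
    for candidate in [term] + synonyms.get(term.lower(), []):
        candidate = candidate.lower()
        if candidate not in expanded:
            expanded.append(candidate)
    return expanded

def matches(report_text, terms):
    """True if any term (optionally followed by a plural 's') occurs in the text."""
    text = report_text.lower()
    return any(re.search(r"\b" + re.escape(t) + r"s?\b", text) for t in terms)

def retrieve(reports, terms):
    """Return the set of report ids whose text matches at least one term."""
    return {rid for rid, text in reports.items() if matches(text, terms)}

def precision_recall(retrieved, relevant):
    """Precision = TP / retrieved, recall = TP / relevant (0.0 if a set is empty)."""
    true_positives = len(retrieved & relevant)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall

# Toy corpus and manual labels, invented for this example only.
reports = {
    "r1": "Simple liver cyst in segment 4.",
    "r2": "Multiple hepatic cysts, unchanged.",
    "r3": "No focal hepatic lesion.",
    "r4": "Hepatic cyst measuring 1.2 cm.",
}
relevant = {"r1", "r2", "r4"}

single = retrieve(reports, ["liver cyst"])
expanded = retrieve(reports, expand_query("liver cyst"))
print("single term:", precision_recall(single, relevant))
print("expanded   :", precision_recall(expanded, relevant))
```

In this toy corpus the single term misses the reports that say "hepatic cyst", so recall improves once the synonym is added. The figures it prints are properties of the invented data and are unrelated to the study's results. The sketch also ignores negation: a report stating "no liver cyst" would be counted as a match.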
