Under-specification as the source of ambiguity and vagueness in narrative phenotype algorithm definitions

Jingzhi Yu1, Jennifer A. Pacheco2, Anika S. Ghosh2, Yuan Luo2, Chunhua Weng3, Ning Shang3, Barbara Benoit4, David S. Carrell5, Robert J. Carroll6, Ozan Dikilitas7, Robert R. Freimuth8, Vivian S. Gainer4, Hakon Hakonarson9, George Hripcsak3, Iftikhar J. Kullo7, Frank Mentch9, Shawn N. Murphy4, Peggy L. Peissig10, Andrea H. Ramirez6, Nephi Walton11, Wei-Qi Wei6, Luke V. Rasmussen12
1Center for Health Information Partnerships (CHIP), Northwestern University Feinberg School of Medicine, Chicago, USA
2Northwestern University Feinberg School of Medicine, Chicago, USA
3Department of Biomedical Informatics, Columbia University, New York, USA
4Research IS and Computing, Massachusetts General Hospital Brigham, Somerville, USA
5Kaiser Permanente Washington Health Research Institute, Seattle, USA
6Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, USA
7Department of Cardiovascular Medicine, Mayo Clinic, Rochester, USA
8Department of Health Sciences Research, Mayo Clinic, Rochester, USA
9Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, USA
10Biomedical Informatics Research Center, Marshfield Clinic Research Institute, Marshfield, USA
11Intermountain Precision Genomics, Intermountain Healthcare, St. George, USA
12Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, USA

Abstract

Currently, one of the commonly used methods for disseminating electronic health record (EHR)-based phenotype algorithms is providing a narrative description of the algorithm logic, often accompanied by flowcharts. A challenge with this mode of dissemination is the potential for under-specification in the algorithm definition, which leads to ambiguity and vagueness. This study examines incidents of under-specification that occurred during the implementation of 34 narrative phenotyping algorithms in the electronic Medical Record and Genomics (eMERGE) network. We reviewed the online communication history between algorithm developers and implementers within the Phenotype Knowledge Base (PheKB) platform, where questions could be raised and answered regarding the intended implementation of a phenotype algorithm. We developed a taxonomy of under-specification categories via an iterative review process between two groups of annotators. Under-specifications that lead to ambiguity and vagueness were consistently found across narrative phenotype algorithms developed by all involved eMERGE sites. Our findings highlight that under-specification is an impediment to the accuracy and efficiency of the implementation of current narrative phenotyping algorithms, and we propose approaches for mitigating these issues and improved methods for disseminating EHR phenotyping algorithms.
