Shrinkage observed-to-expected ratios for robust and transparent large-scale pattern discovery

Statistical Methods in Medical Research - Volume 22, Issue 1 - Pages 57-69 - 2013
G. Niklas Norén1,2, Johan Hopstadius1, Andrew Bate1,3
1 Uppsala Monitoring Centre, WHO Collaborating Centre for International Drug Monitoring, Uppsala, Sweden
2 Department of Mathematics, Stockholm University, Stockholm, Sweden
3 School of Information Systems, Computing and Mathematics, Brunel University, London, UK

Abstract

Large observational data sets are a great asset for understanding the effects of medicines in clinical practice and, ultimately, for improving patient care. For an empirical pattern in observational data to be of practical relevance, it should represent a substantial deviation from the null model. Statistical significance tests are inadequate for identifying such deviations, as they do not on their own distinguish the magnitude of an effect from its data support. The observed-to-expected (OE) ratio, on the other hand, directly measures strength of association and is an intuitive basis for identifying a range of patterns related to event rates, including pairwise associations, higher order interactions and temporal associations between events. It is sensitive to random fluctuations for rare events with low expected counts, but statistical shrinkage can protect against spurious associations. Shrinkage OE ratios provide a simple but powerful framework for large-scale pattern discovery. In this article, we outline a range of patterns that are naturally viewed in terms of OE ratios and propose a straightforward and effective statistical shrinkage transformation that can be applied to any such ratio. The proposed approach retains emphasis on the practical relevance and transparency of highlighted patterns, while protecting against spurious associations.
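
The abstract does not spell out the shrinkage transformation, so the following is only a minimal sketch of the general idea, assuming an additive form in which the same constant is added to both the observed and the expected count. The function names, the independence-based expected count and the default value `alpha=0.5` are illustrative assumptions, not details taken from the text above.

```python
import math


def expected_count(n_x: int, n_y: int, n_total: int) -> float:
    """Expected co-occurrence count of x and y if they were independent.

    Assumption: the null model is independence of the two events.
    """
    return n_x * n_y / n_total


def oe_ratio(observed: float, expected: float) -> float:
    """Raw observed-to-expected ratio; unstable when `expected` is small."""
    return observed / expected


def shrunk_oe_ratio(observed: float, expected: float, alpha: float = 0.5) -> float:
    """OE ratio shrunk towards 1 by adding `alpha` to both counts.

    Assumption: additive shrinkage with a hypothetical default alpha of 0.5.
    """
    return (observed + alpha) / (expected + alpha)


# A single report with a tiny expected count gives an extreme raw ratio,
# whereas the shrunk ratio stays moderate.
raw_rare = oe_ratio(1, 0.01)
shrunk_rare = shrunk_oe_ratio(1, 0.01)

# With ample data, shrinkage changes the ratio only slightly.
raw_common = oe_ratio(200, 100)
shrunk_common = shrunk_oe_ratio(200, 100)

# The log2 of the shrunk ratio gives a symmetric scale centred on zero.
log2_shrunk_rare = math.log2(shrunk_rare)
```

In this sketch the shrinkage matters most when the expected count is far below `alpha` and becomes negligible as counts grow, which is the behaviour the abstract describes: the ratio stays an interpretable measure of association strength while rare-event fluctuations are damped.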
