Hierarchical Bayes small area estimation for county-level health prevalence to having a personal doctor

Andreea Erciulescu1, Jianzhu Li2, Tom Krenzke1, Machell Town3
1Westat, Rockville, USA
2Financial Industry Regulatory Authority, Rockville, USA
3Population Health Surveillance Branch, Division of Population Health, National Center for Chronic Disease Prevention and Health Promotion, Centers for Disease Control and Prevention, Atlanta, USA

Abstract

The complexity of survey data and the availability of data from auxiliary sources motivate researchers to explore estimation methods that extend beyond traditional survey-based estimation. The U.S. Centers for Disease Control and Prevention’s Behavioral Risk Factor Surveillance System (BRFSS) collects a wide range of health information, including whether respondents have a personal doctor. While the BRFSS focuses on state-level estimation, there is demand for county-level estimates of health indicators based on BRFSS data. A hierarchical Bayes small area estimation model is developed to combine county-level BRFSS survey data with county-level data from auxiliary sources, while accounting for various sources of error and for nested geographical levels. To mitigate extreme proportions and unstable survey variances, a transformation is applied to the survey data. Model-based county-level predictions of the prevalence of having a personal doctor are constructed for all U.S. counties, including those where BRFSS survey data were not available. An evaluation study is also presented that compares fitting the model using only the counties with large BRFSS sample sizes against fitting it using all the counties with BRFSS data.
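The idea of combining a direct survey estimate with auxiliary information can be illustrated with a minimal sketch. It is not the authors' hierarchical Bayes model: it assumes a simple area-level (Fay-Herriot-style) setup in which the sampling variance and the between-county model variance are treated as known, it works directly on the proportion scale without any transformation, and it ignores the nested state level. The function name, argument names, and numeric values are all hypothetical.

```python
from typing import Optional


def shrinkage_prediction(
    direct: Optional[float],
    sampling_var: Optional[float],
    synthetic: float,
    model_var: float,
) -> float:
    """Blend a direct survey estimate with a regression-based (synthetic) one.

    Illustrative assumptions, not taken from the paper:
      direct       -- direct survey estimate for a county (None if unsampled)
      sampling_var -- sampling variance of the direct estimate, treated as known
      synthetic    -- prediction from auxiliary covariates (e.g. x' * beta)
      model_var    -- between-county model variance, treated as known
    """
    # Counties without survey data fall back to the synthetic prediction.
    if direct is None or sampling_var is None:
        return synthetic
    if sampling_var < 0 or model_var < 0 or (sampling_var + model_var) == 0:
        raise ValueError("variances must be non-negative and not both zero")
    # Weight on the direct estimate: close to 1 when the survey estimate is
    # precise, close to 0 when its sampling variance is large.
    gamma = model_var / (model_var + sampling_var)
    return gamma * direct + (1.0 - gamma) * synthetic


if __name__ == "__main__":
    # Hypothetical counties: (direct estimate, sampling variance, synthetic)
    counties = {
        "large sample": (0.90, 0.0001, 0.80),
        "small sample": (0.60, 0.0900, 0.80),
        "no survey data": (None, None, 0.80),
    }
    for name, (direct, svar, synth) in counties.items():
        pred = shrinkage_prediction(direct, svar, synth, model_var=0.01)
        print(f"{name}: {pred:.3f}")
```

In this sketch the prediction for a county with a large sample stays near its direct estimate, the prediction for a county with a small sample is pulled toward the covariate-based value, and a county with no survey data receives the covariate-based value alone. A full hierarchical Bayes treatment would instead place priors on the unknown variance and regression parameters and summarize the posterior distribution of each county's prevalence.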
