Geographic bias related to geocoding in epidemiologic studies

Springer Science and Business Media LLC - Volume 4 - Pages 1-9 - 2005
M Norman Oliver (1,2), Kevin A Matthews (1), Mir Siadaty (2), Fern R Hauck (1,2), Linda W Pickle (3)
(1) Department of Family Medicine, University of Virginia, Charlottesville, USA
(2) Department of Public Health Sciences, University of Virginia, Charlottesville, USA
(3) Surveillance Research Program, Division of Cancer Control and Population Sciences, National Cancer Institute, Bethesda, USA

Abstract

This article describes geographic bias in GIS analyses that rely on unrepresentative data owing to missing geocodes, using as an example a spatial analysis of prostate cancer incidence among whites and African Americans in Virginia, 1990–1999. Statistical tests for clustering were performed and the resulting clusters mapped. The patterns of missing census tract identifiers for the cases were examined with generalized linear regression models. The county of residence was known for all cases, and 26,338 (74%) of these cases were geocoded successfully to census tracts. Cluster maps showed markedly different patterns depending on whether all cases or only those geocoded to the census tract were used. Multivariate regression analysis showed that, in the most rural counties (where the missing data were concentrated), the percent of a county's population over age 64 and the percent with less than a high school education were each independently associated with a higher percent of missing geocodes. We found statistically significant pattern differences resulting from spatially non-random differences in geocoding completeness across Virginia. Appropriate interpretation of maps, therefore, requires an understanding of this phenomenon, which we call "cartographic confounding."
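
The arithmetic behind this bias can be illustrated with a minimal sketch. It is not the analysis used in the study: the two counties, their populations, and their case counts below are hypothetical values chosen only to show the effect, and the names `County`, `rate_per_100k`, and `summarize` are introduced here for illustration. Both counties are given the same incidence when all cases are counted, but different geocoding completeness, so a map built only from tract-geocoded cases would suggest a difference that reflects missing geocodes rather than disease.

```python
from dataclasses import dataclass


@dataclass
class County:
    name: str        # hypothetical label, not a real Virginia county
    population: int  # persons at risk (made-up value)
    cases: int       # all cases; county of residence is known
    geocoded: int    # cases successfully geocoded to a census tract


def rate_per_100k(cases: int, population: int) -> float:
    """Crude incidence rate per 100,000 persons."""
    return 100_000 * cases / population


def summarize(counties):
    """Compare rates from all cases with rates from geocoded cases only."""
    rows = []
    for c in counties:
        rows.append({
            "name": c.name,
            "completeness": c.geocoded / c.cases,
            "rate_all": rate_per_100k(c.cases, c.population),
            "rate_geocoded": rate_per_100k(c.geocoded, c.population),
        })
    return rows


# Hypothetical data: identical rates from all cases, unequal completeness.
COUNTIES = [
    County("urban_example", population=200_000, cases=400, geocoded=380),
    County("rural_example", population=50_000, cases=100, geocoded=50),
]

for row in summarize(COUNTIES):
    print(
        f"{row['name']}: completeness={row['completeness']:.0%}, "
        f"rate (all cases)={row['rate_all']:.1f}, "
        f"rate (geocoded only)={row['rate_geocoded']:.1f}"
    )
```

In this toy setting the rural county appears to have roughly half the incidence of the urban county when only geocoded cases are used, although the two rates are equal when every case is counted. Comparing the two columns county by county is one simple way to check whether geocoding completeness varies over space before interpreting a tract-level map.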
