Comparison of three methods to estimate genetic ancestry and control for stratification in genetic association studies among admixed populations

Springer Science and Business Media LLC - Volume 118 - Pages 424-433 - 2005
Hui-Ju Tsai (1,2), Shweta Choudhry (1,2), Mariam Naqvi (1,2), William Rodriguez-Cintron (3), Esteban González Burchard (1,2,4), Elad Ziv (1,4)
(1) Department of Medicine, University of California, San Francisco, USA
(2) Lung Biology Center, San Francisco General Hospital, San Francisco, USA
(3) San Juan VAMC, University of Puerto Rico School of Medicine, San Juan, USA
(4) Center for Human Genetics, University of California, San Francisco, USA

Abstract

Population stratification may confound the results of genetic association studies among unrelated individuals from admixed populations. Several methods have been proposed to estimate individual ancestry in admixed populations and to adjust for population stratification in genetic association tests. We evaluate the performance of three such methods (maximum likelihood estimation, ADMIXMAP, and Structure) using a range of simulated data sets and real data from Latino subjects participating in a genetic study of asthma. All three methods estimate ancestry with similar accuracy and control the type I error rate to a similar degree. The most important factor in determining the accuracy of the ancestry estimates and in minimizing the type I error rate is the number of markers used to estimate ancestry. We demonstrate that approximately 100 ancestry informative markers (AIMs) are required to obtain ancestry estimates whose correlation with the true individual ancestral proportions exceeds 0.9. In addition, after ancestry is accounted for in the association tests, the type I error rate is controlled at the 5% level when 100 markers are used to estimate ancestry. However, because the effect of admixture on the type I error rate worsens with sample size, the accuracy of the ancestry estimates must increase accordingly to make the appropriate correction. Using data from the Latino subjects, we also apply these methods to an association study between body mass index and 44 AIMs. These simulations are meant to provide practical guidelines for investigators conducting association studies in admixed populations.
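
As a rough illustration of the maximum likelihood approach named in the abstract, the sketch below estimates one individual's ancestry proportion from AIM genotypes and reports how well the estimates correlate with the simulated truth. It is not the authors' implementation. It assumes only two ancestral populations with known allele frequencies, unlinked markers, a grid search in place of a numerical optimizer, and highly informative markers; all function names and parameter values are hypothetical.

```python
import math
import random


def simulate_genotypes(q, freqs_a, freqs_b, rng):
    """Draw genotypes (0, 1, or 2 reference alleles) for one individual.

    q is the individual's true ancestry proportion from population A
    (assumption: two-way admixture, markers independent).
    """
    genos = []
    for pa, pb in zip(freqs_a, freqs_b):
        p = q * pa + (1 - q) * pb  # expected allele frequency given ancestry q
        genos.append(sum(rng.random() < p for _ in range(2)))
    return genos


def log_likelihood(q, genos, freqs_a, freqs_b):
    """Binomial log-likelihood of the genotypes given ancestry q.

    The binomial coefficient is omitted because it does not depend on q.
    """
    ll = 0.0
    for g, pa, pb in zip(genos, freqs_a, freqs_b):
        p = q * pa + (1 - q) * pb
        p = min(max(p, 1e-9), 1 - 1e-9)  # guard against log(0)
        ll += g * math.log(p) + (2 - g) * math.log(1 - p)
    return ll


def estimate_ancestry(genos, freqs_a, freqs_b, grid=51):
    """Maximum likelihood estimate of q by grid search over [0, 1]."""
    candidates = [i / (grid - 1) for i in range(grid)]
    return max(candidates, key=lambda q: log_likelihood(q, genos, freqs_a, freqs_b))


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def ancestry_correlation(n_markers, n_individuals=100, seed=0):
    """Correlation between true and estimated ancestry for a given marker count."""
    rng = random.Random(seed)
    # Assumed allele frequencies: large differences between the two populations.
    freqs_a = [rng.uniform(0.7, 0.95) for _ in range(n_markers)]
    freqs_b = [rng.uniform(0.05, 0.3) for _ in range(n_markers)]
    true_q = [rng.random() for _ in range(n_individuals)]
    est_q = [
        estimate_ancestry(simulate_genotypes(q, freqs_a, freqs_b, rng), freqs_a, freqs_b)
        for q in true_q
    ]
    return pearson(true_q, est_q)


if __name__ == "__main__":
    for m in (5, 25, 100):
        print(m, "markers: r =", round(ancestry_correlation(m), 3))
```

Under these assumptions the correlation tends to rise as markers are added. The exact values depend on how informative the simulated markers are and on the spread of true ancestry, so they should not be read as a reproduction of the paper's results.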
