Validating, augmenting and refining genome-wide association signals

Nature Reviews Genetics - Volume 10, Issue 5 - Pages 318-329 - 2009
John P. A. Ioannidis1, Gilles Thomas2, Mark J. Daly3
1Department of Hygiene and Epidemiology, Clinical and Molecular Epidemiology Unit, University of Ioannina School of Medicine and Biomedical Research Institute, Foundation for Research and Technology — Hellas, Ioannina, 45110, Greece
2Division of Cancer Epidemiology and Genetics, Department of Health and Human Services, National Cancer Institute, National Institutes of Health, Bethesda, 20892, Maryland, USA
3Center for Human Genetic Research, Massachusetts General Hospital, Richard B. Simches Research Center, Boston, 02114, Massachusetts, USA

References

McCarthy, M. I. et al. Genome-wide association studies for complex traits: consensus, uncertainty and challenges. Nature Rev. Genet. 9, 356–369 (2008). A comprehensive review of challenges in the discovery of associations using GWA studies.

Manolio, T. A., Brooks, L. D. & Collins, F. S. A HapMap harvest of insights into the genetics of common disease. J. Clin. Invest. 118, 1590–1605 (2008).

Janssens, A. C. & van Duijn, C. M. Genome-based prediction of common diseases: advances and prospects. Hum. Mol. Genet. 17, R166–R173 (2008).

Hoggart, C. J., Clark, T. G., De Iorio, M., Whittaker, J. C. & Balding, D. J. Genome-wide significance for dense SNP and resequencing data. Genet. Epidemiol. 32, 179–185 (2008).

Pe'er, I., Yelensky, R., Altshuler, D. & Daly, M. J. Estimation of the multiple testing burden for genomewide association studies of nearly all common variants. Genet. Epidemiol. 32, 381–385 (2008).

Clarke, G. M., Carter, K. W., Palmer, L. J., Morris, A. P. & Cardon, L. R. Fine mapping versus replication in whole-genome association studies. Am. J. Hum. Genet. 81, 995–1005 (2007).

Hindorff, L. A., Junkins, H. A., Mehta, J. P. & Manolio, T. A. A Catalog of Published Genome-Wide Association Studies. National Human Genome Research Institute [online] http://www.genome.gov/26525384 (2009). A continuously updated online list of GWA studies and their main results.

Altshuler, D., Daly, M. J. & Lander, E. S. Genetic mapping in human disease. Science 322, 881–888 (2008).

Zeggini, E. & Ioannidis, J. P. A. Meta-analysis of genome-wide association studies. Pharmacogenomics 10, 191–201 (2009).

de Bakker, P. I. et al. Practical aspects of imputation-driven meta-analysis of genome-wide association studies. Hum. Mol. Genet. 17, R122–R128 (2008).

Zeggini, E. et al. Meta-analysis of genome-wide association data and large-scale replication identifies additional susceptibility loci for type 2 diabetes. Nature Genet. 40, 638–645 (2008). An early paradigm of the application of meta-analysis in combining several GWA data sets and subsequent replication studies.

Barrett, J. C. et al. Genome-wide association defines more than 30 distinct susceptibility loci for Crohn's disease. Nature Genet. 40, 955–962 (2008).

The GIANT consortium. Six new loci associated with body mass index highlight a neuronal influence on body weight regulation. Nature Genet. 41, 25–34 (2009).

Seminara, D. et al. The emergence of networks in human genome epidemiology: challenges and opportunities. Epidemiology 18, 1–8 (2007).

Pahl, R., Schäfer, H. & Müller, H. H. Optimal multistage designs—a general framework for efficient genome-wide association studies. Biostatistics 10, 297–309 (2009).

Gail, M. H., Pfeiffer, R. M., Wheeler, W. & Pee, D. Probability that a two-stage genome-wide association study will detect a disease-associated SNP and implications for multistage designs. Ann. Hum. Genet. 72, 812–820 (2008).

Skol, A. D., Scott, L. J., Abecasis, G. R. & Boehnke, M. Joint analysis is more efficient than replication-based analysis for two-stage genome-wide association studies. Nature Genet. 38, 209–213 (2006).

Nothnagel, M., Ellinghaus, D., Schreiber, S., Krawczak, M. & Franke, A. A comprehensive evaluation of SNP genotype imputation. Hum. Genet. 125, 163–171 (2009).

Guan, Y. & Stephens, M. Practical issues in imputation-based association mapping. PLoS Genet. 4, e1000279 (2008).

Marchini, J. et al. A new multipoint method for genome-wide association studies by imputation of genotypes. Nature Genet. 39, 906–913 (2007).

Browning, S. R. Missing data imputation and haplotype phase inference for genome-wide association studies. Hum. Genet. 124, 439–450 (2008).

Browning, S. R. & Browning, B. L. Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering. Am. J. Hum. Genet. 81, 1084–1097 (2007).

Trikalinos, T. A., Salanti, G., Zintzaras, E. & Ioannidis, J. P. Meta-analysis methods. Adv. Genet. 60, 311–334 (2008).

Kavvoura, F. K. & Ioannidis, J. P. Methods for meta-analysis in genetic association studies: a review of their potential and pitfalls. Hum. Genet. 123, 1–14 (2008).

Sutton, A. J., Abrams, K. R., Jones, D. R., Sheldon, T. A. & Song, F. Methods for Meta-Analysis in Medical Research (Wiley, Chichester, 2000).

Sutton, A. J. & Higgins, J. P. Recent developments in meta-analysis. Stat. Med. 27, 625–650 (2008).

Spiegelhalter, D. J., Abrams, K. R. & Myles, P. J. Bayesian Approaches to Clinical Trials and Health-Care Evaluation Ch. 8, 267–305 (Wiley, Chichester, 2004).

Salanti, G., Higgins, J. P., Trikalinos, T. A. & Ioannidis, J. P. Bayesian meta-analysis and meta-regression for gene–disease associations and deviations from Hardy–Weinberg equilibrium. Stat. Med. 26, 553–567 (2007).

Thorlund, K. et al. Can trial sequential monitoring boundaries reduce spurious inferences from meta-analyses? Int. J. Epidemiol. 38, 276–286 (2009).

Zollner, S. & Pritchard, J. K. Overcoming the winner's curse: estimating penetrance parameters from case–control data. Am. J. Hum. Genet. 80, 605–615 (2007). A thorough presentation of the winner's curse and of the proposed approach for correcting for it.

Ioannidis, J. P. Why most discovered true associations are inflated. Epidemiology 19, 640–648 (2008).

Moonesinghe, R., Khoury, M. J., Liu, T. & Ioannidis, J. P. Required sample size and nonreplicability thresholds for heterogeneous genetic associations. Proc. Natl Acad. Sci. USA 105, 617–622 (2008).

Ioannidis, J. P., Patsopoulos, N. A. & Evangelou, E. Uncertainty in heterogeneity estimates in meta-analyses. BMJ 335, 914–916 (2007).

Ioannidis, J. P. Non-replication and inconsistency in the genome-wide association setting. Hum. Hered. 64, 203–213 (2007).

Wellcome Trust Case Control Consortium. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature 447, 661–678 (2007).

Price, A. L. et al. Principal components analysis corrects for stratification in genome-wide association studies. Nature Genet. 38, 904–909 (2006).

Kavvoura, F. K. et al. Evaluation of the potential excess of statistically significant findings in published genetic association studies: application to Alzheimer's disease. Am. J. Epidemiol. 168, 855–865 (2008).

Slatkin, M. Linkage disequilibrium — understanding the evolutionary past and mapping the medical future. Nature Rev. Genet. 9, 477–485 (2008).

International HapMap Consortium. A haplotype map of the human genome. Nature 437, 1299–1320 (2005).

McCarroll, S. A. et al. Integrated detection and population-genetic analysis of SNPs and copy number variation. Nature Genet. 40, 1166–1174 (2008).

Ioannidis, J. P., Ntzani, E. E. & Trikalinos, T. A. 'Racial' differences in genetic effects for complex diseases. Nature Genet. 36, 1312–1318 (2004).

Easton, D. F. et al. Genome-wide association study identifies novel breast cancer susceptibility loci. Nature 447, 1087–1093 (2007).

Ng, M. C. et al. Implication of genetic variants near TCF7L2, SLC30A8, HHEX, CDKAL1, CDKN2A/B, IGF2BP2, and FTO in type 2 diabetes and obesity in 6,719 Asians. Diabetes 57, 2226–2233 (2008).

Gudbjartsson, D. F. et al. Variants conferring risk of atrial fibrillation on chromosome 4q25. Nature 448, 353–357 (2007).

Grant, S. F. et al. Association analysis of the FTO gene with obesity in children of Caucasian and African ancestry reveals a common tagging SNP. PLoS ONE 3, e1746 (2008).

Li, H. et al. Variants in the fat mass- and obesity-associated (FTO) gene are not associated with obesity in a Chinese Han population. Diabetes 57, 264–268 (2008).

Grant, S. F. et al. Variant of transcription factor 7-like 2 (TCF7L2) gene confers risk of type 2 diabetes. Nature Genet. 38, 320–323 (2006).

Helgason, A. et al. Refining the impact of TCF7L2 gene variants on type 2 diabetes and adaptive evolution. Nature Genet. 39, 218–225 (2007).

Terwilliger, J. D. & Hiekkalinna, T. An utter refutation of the 'Fundamental Theorem of the HapMap'. Eur. J. Hum. Genet. 14, 426–437 (2006).

Thomas, D. & Stram, D. An utter refutation of the 'Fundamental Theorem of the HapMap' by Terwilliger and Hiekkalinna. Eur. J. Hum. Genet. 14, 1238–1239 (2006).

Nunnally, J. C. Introduction to Psychological Measurement (McGraw–Hill, New York, 1970).

Nath, S. K. et al. A nonsynonymous functional variant in integrin-αM (encoded by ITGAM) is associated with systemic lupus erythematosus. Nature Genet. 40, 152–154 (2008).

Amundadottir, L. T. et al. A common variant associated with prostate cancer in European and African populations. Nature Genet. 38, 652–658 (2006).

Freedman, M. L. et al. Admixture mapping identifies 8q24 as a prostate cancer risk locus in African–American men. Proc. Natl Acad. Sci. USA 103, 14068–14073 (2006).

Yeager, M. et al. Genome-wide association study of prostate cancer identifies a second risk locus at 8q24. Nature Genet. 39, 645–649 (2007).

Haiman, C. A. et al. Multiple regions within 8q24 independently affect risk for prostate cancer. Nature Genet. 39, 638–644 (2007).

Zanke, B. W. et al. Genome-wide association scan identifies a colorectal cancer susceptibility locus on chromosome 8q24. Nature Genet. 39, 989–994 (2007).

Ghoussaini, M. et al. Multiple loci with different cancer specificities within the 8q24 gene desert. J. Natl. Cancer Inst. 100, 962–966 (2008).

Gudmundsson, J. et al. Genome-wide association study identifies a second prostate cancer susceptibility variant at 8q24. Nature Genet. 39, 631–637 (2007).

Kiemeney, L. A. et al. Sequence variant on 8q24 confers susceptibility to urinary bladder cancer. Nature Genet. 40, 1307–1312 (2008).

Wokolorczyk, D. et al. A range of cancers is associated with the rs6983267 marker on chromosome 8. Cancer Res. 68, 9982–9986 (2008).

Park, S. L. et al. Associations between variants of the 8q24 chromosome and nine smoking-related cancer sites. Cancer Epidemiol. Biomarkers Prev. 17, 3193–3202 (2008).

Xie, X. et al. Systematic discovery of regulatory motifs in human promoters and 3′ UTRs by comparison of several mammals. Nature 434, 338–345 (2005).

Veyrieras, J. B. et al. High-resolution mapping of expression-QTLs yields insight into human gene regulation. PLoS Genet. 4, e1000214 (2008).

Petretto, E. et al. Heritability and tissue specificity of expression quantitative trait loci. PLoS Genet. 2, e172 (2006).

Libioulle, C. et al. Novel Crohn disease locus identified by genome-wide association maps to a gene desert on 5p13.1 and modulates expression of PTGER4. PLoS Genet. 3, e58 (2007).

International HapMap Consortium. A second generation human haplotype map of over 3.1 million SNPs. Nature 449, 851–861 (2007). A description of the second generation of the HapMap.

Voelkerding, K. V., Dames, S. A. & Durtschi, J. D. Next-generation sequencing: from basic research to diagnostics. Clin. Chem. 26 Feb 2009 (doi:10.1373/clinchem.2008.112789).

Wang, J. et al. The diploid genome sequence of an Asian individual. Nature 456, 60–65 (2008).

Nyholt, D. R. A simple correction for multiple testing for single-nucleotide polymorphisms in linkage disequilibrium with each other. Am. J. Hum. Genet. 74, 765–769 (2004).

Li, J. & Ji, L. Adjusting multiple testing in multilocus analyses using the eigenvalues of a correlation matrix. Heredity 95, 221–227 (2005).

Lin, D. Y. An efficient Monte Carlo approach to assessing statistical significance in genomic studies. Bioinformatics 21, 781–787 (2005).

McCarroll, S. A. et al. Deletion polymorphism upstream of IRGM associated with altered IRGM expression and Crohn's disease. Nature Genet. 40, 1107–1120 (2008).

Gorlov, I. P., Gorlova, O. Y., Sunyaev, S. R., Spitz, M. R. & Amos, C. I. Shifting paradigm of association studies: value of rare single-nucleotide polymorphisms. Am. J. Hum. Genet. 82, 100–112 (2008).

Kryukov, G. V., Pennacchio, L. A. & Sunyaev, S. R. Most rare missense alleles are deleterious in humans: implications for complex disease and association studies. Am. J. Hum. Genet. 80, 727–739 (2007).

Yeo, G. S. et al. Mutations in the human melanocortin-4 receptor gene associated with severe familial obesity disrupts receptor function through multiple molecular mechanisms. Hum. Mol. Genet. 12, 561–574 (2003).

Cohen, J. C. et al. Multiple rare alleles contribute to low plasma levels of HDL cholesterol. Science 305, 869–872 (2004).

Ueda, H. et al. Association of the T-cell regulatory gene CTLA4 with susceptibility to autoimmune disease. Nature 423, 506–511 (2003).

Harrell, F. E. Jr, Lee, K. L. & Mark, D. B. Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat. Med. 15, 361–387 (1996).

Stephens, M. & Donnelly, P. A comparison of Bayesian methods for haplotype reconstruction from population genotype data. Am. J. Hum. Genet. 73, 1162–1169 (2003).

Graham, R. R. et al. Three functional variants of IFN regulatory factor 5 (IRF5) define risk and protective haplotypes for human lupus. Proc. Natl Acad. Sci. USA 104, 6758–6763 (2007).

Sigurdsson, S. et al. Comprehensive evaluation of the genetic variants of interferon regulatory factor 5 (IRF5) reveals a novel 5 bp length polymorphism as strong risk factor for systemic lupus erythematosus. Hum. Mol. Genet. 17, 872–881 (2008).

Shin, H. D. et al. Different genetic effects of interferon regulatory factor 5 (IRF5) polymorphisms on systemic lupus erythematosus in a Korean population. J. Rheumatol. 35, 2148–2151 (2008).

Kawasaki, A. et al. Association of IRF5 polymorphisms with systemic lupus erythematosus in a Japanese population: support for a crucial role of intron 1 polymorphisms. Arthritis Rheum. 58, 826–834 (2008).

Li, M. et al. CFH haplotypes without the Y402H coding variant show strong association with susceptibility to age-related macular degeneration. Nature Genet. 38, 1049–1054 (2006).

Maller, J. et al. Common variation in three genes, including a noncoding variant in CFH, strongly influences risk of age-related macular degeneration. Nature Genet. 38, 1055–1059 (2006).

Mori, K. et al. Coding and noncoding variants in the CFH gene and cigarette smoking influence the risk of age-related macular degeneration in a Japanese population. Invest. Ophthalmol. Vis. Sci. 48, 5315–5319 (2007).

Minelli, C., Thompson, J. R., Abrams, K. R. & Lambert, P. C. Bayesian implementation of a genetic model-free approach to the meta-analysis of genetic association studies. Stat. Med. 24, 3845–3861 (2005).

Risch, N. & Botstein, D. Discovering genotypes underlying human phenotypes: past successes for Mendelian disease, future approaches for complex disease. Nature Genet. 33 (Suppl.), 228–237 (2003).

Warner, J. B. et al. Systematic identification of mammalian regulatory motifs' target genes and function. Nature Methods 5, 347–353 (2008).

Tompa, M. et al. Assessing computational tools for the discovery of transcription factor binding sites. Nature Biotechnol. 23, 137–144 (2005).

Kariuki, S. N. et al. Autoimmune disease risk variant of STAT4 confers increased sensitivity to IFN-α in lupus patients in vivo. J. Immunol. 182, 34–38 (2009).

Kuballa, P., Huett, A., Rioux, J. D., Daly, M. J. & Xavier, R. Impaired autophagy of an intracellular pathogen induced by a Crohn's disease associated ATG16L1 variant. PLoS ONE 3, e3391 (2008).

Ogura, Y. et al. Genetic variation and activity of mouse Nod2, a susceptibility gene for Crohn's disease. Genomics 81, 369–377 (2003).

Shen, S. et al. Schizophrenia-related neural and behavioural phenotypes in transgenic mice expressing truncated Disc1. J. Neurosci. 28, 10893–10904 (2008).

Ioannidis, J. P. & Kavvoura, F. K. Concordance of functional in vitro data and epidemiological associations in complex disease genetics. Genet. Med. 8, 583–593 (2006).

Martin, L. J. et al. Phenotypic, genetic, and genome-wide structure in the metabolic syndrome. BMC Genet. 4 (Suppl. 1), S95 (2003).

Aukes, M. F. et al. Genetic overlap among intelligence and other candidate endophenotypes for schizophrenia. Biol. Psychiatry 65, 527–534 (2009).

Zeggini, E. et al. Replication of genome-wide association signals in UK samples reveals risk loci for type 2 diabetes. Science 316, 1336–1341 (2007).

Frayling, T. M. et al. A common variant in the FTO gene is associated with body mass index and predisposes to childhood and adult obesity. Science 316, 889–894 (2007).

Ioannidis, J. P., Patsopoulos, N. A. & Evangelou, E. Heterogeneity in meta-analyses of genome-wide association investigations. PLoS ONE 2, e841 (2007).

Toulopoulou, T. et al. Substantial genetic overlap between neurocognition and schizophrenia: genetic modeling in twin samples. Arch. Gen. Psychiatry 64, 1348–1355 (2007).

Bottini, N., Vang, T., Cucca, F. & Mustelin, T. Role of PTPN22 in type 1 diabetes and other autoimmune diseases. Semin. Immunol. 18, 207–213 (2006).

Kavvoura, F. K. et al. Cytotoxic T-lymphocyte associated antigen 4 gene polymorphisms and autoimmune thyroid disease: a meta-analysis. J. Clin. Endocrinol. Metab. 92, 3162–3170 (2007).

Kavvoura, F. K. & Ioannidis, J. P. CTLA-4 gene polymorphisms and susceptibility to type 1 diabetes mellitus: a HuGE Review and meta-analysis. Am. J. Epidemiol. 162, 3–16 (2005).

Gudmundsson, J. et al. Two variants on chromosome 17 confer prostate cancer risk, and the one in TCF2 protects against type 2 diabetes. Nature Genet. 39, 977–983 (2007).

Orho-Melander, M. et al. A common missense variant in the glucokinase regulatory protein gene (GCKR) is associated with increased plasma triglyceride and C-reactive protein but lower fasting glucose concentrations. Diabetes 57, 3112–3121 (2008).

Wojczynski, M. K. & Tiwari, H. K. Definition of phenotype. Adv. Genet. 60, 75–105 (2008).

Viswesvaran, C. & Ones, D. S. Measurement error in “Big Five Factors” personality assessment: reliability generalization across studies and measures. Educ. Psychol. Meas. 60, 224–235 (2000).

Dina, C. et al. Variation in FTO contributes to childhood obesity and severe adult obesity. Nature Genet. 39, 724–726 (2007).

Contopoulos-Ioannidis, D. G., Alexiou, G. A., Gouvias, T. C. & Ioannidis, J. P. An empirical evaluation of multifarious outcomes in pharmacogenetics: β2 adrenoceptor gene polymorphisms in asthma treatment. Pharmacogenet. Genomics 16, 705–711 (2006).

Goh, K. I. et al. The human disease network. Proc. Natl Acad. Sci. USA 104, 8685–8690 (2007).

Lage, K. et al. A human phenome–interactome network of protein complexes implicated in genetic disorders. Nature Biotechnol. 25, 309–316 (2007).

van Driel, M. A., Bruggeman, J., Vriend, G., Brunner, H. G. & Leunissen, J. A. A text-mining analysis of the human phenome. Eur. J. Hum. Genet. 14, 535–542 (2006).

Wild, C. P. Complementing the genome with an “exposome”: the outstanding challenge of environmental exposure measurement in molecular epidemiology. Cancer Epidemiol. Biomarkers Prev. 14, 1847–1850 (2005).

Garcia-Closas, M. et al. Heterogeneity of breast cancer associations with five susceptibility loci by clinical and pathological characteristics. PLoS Genet. 4, e1000054 (2008).

NCI–NHGRI Working Group on Replication in Association Studies. Replicating genotype–phenotype associations. Nature 447, 655–660 (2007).

Ioannidis, J. P. Molecular evidence-based medicine: evolution and integration of information in the genomic era. Eur. J. Clin. Invest. 37, 340–349 (2007).

Mailman, M. D. et al. The NCBI dbGaP database of genotypes and phenotypes. Nature Genet. 39, 1181–1186 (2007).

GAIN Collaborative Research Group. New models of collaboration in genome-wide association studies: the Genetic Association Information Network. Nature Genet. 39, 1045–1051 (2007).