Rare-Variant Association Testing for Sequencing Data with the Sequence Kernel Association Test

The American Journal of Human Genetics - Volume 89 - Pages 82-93 - 2011
Michael C. Wu (1), Seunggeun Lee (2), Tianxi Cai (2), Yun Li (1, 3), Michael Boehnke (4), Xihong Lin (2)
(1) Department of Biostatistics, The University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
(2) Department of Biostatistics, Harvard School of Public Health, Boston, MA 02115, USA
(3) Department of Genetics, The University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
(4) Department of Biostatistics and Center for Statistical Genetics, University of Michigan, Ann Arbor, MI 48109, USA
