Links Between the Sequence Kernel Association and the Kernel-Based Adaptive Cluster Tests

Statistics in Biosciences - Volume 9 - Pages 246-258 - 2017
Weiming Zhang1, Michael P. Epstein2, Tasha E. Fingerlin3, Debashis Ghosh1
1Department of Biostatistics and Informatics, Colorado School of Public Health, Aurora, USA
2Department of Human Genetics, Emory University School of Medicine, Atlanta, USA
3Center for Genes Environment and Health, Department of Biomedical Research, National Jewish Health, Denver, USA

Abstract

Two recently developed methods for the analysis of rare variants are the sequence kernel association test (SKAT) and the kernel-based adaptive cluster test (KBAC). SKAT is a type of variance component score test, whereas KBAC computes a weighted integral representing the difference in risk between variants, so the two appear to be developed from different initial principles. In this note, we show that KBAC can be modified to yield a test statistic with operating characteristics more similar to those of SKAT. This development relies on U- and V-statistic theory from mathematical statistics. Simulation studies are used to evaluate the newly proposed tests.
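To make the variance component view concrete, the following is a minimal sketch of a SKAT-style statistic with a weighted linear kernel, assuming no covariates and an intercept-only null model. It is not the modified KBAC statistic proposed in the paper, and the function names `skat_statistic` and `skat_statistic_pairwise` are hypothetical. The second function writes the same quantity as a double sum over pairs of subjects, which is the form that connects to V-statistic theory.

```python
def skat_statistic(genotypes, phenotypes, weights):
    """Weighted linear-kernel SKAT-style statistic (sketch, no covariates).

    genotypes:  list of n rows, each a list of m minor-allele counts (0, 1, 2)
    phenotypes: list of n outcomes (for example 0/1 case status)
    weights:    list of m non-negative variant weights (assumed given)
    """
    n = len(phenotypes)
    mean_y = sum(phenotypes) / n              # intercept-only null model
    resid = [y - mean_y for y in phenotypes]  # residuals under the null

    # Variance component form: Q = sum_j w_j * (sum_i resid_i * G_ij)^2
    q = 0.0
    for j, w in enumerate(weights):
        score_j = sum(resid[i] * genotypes[i][j] for i in range(n))
        q += w * score_j ** 2
    return q


def skat_statistic_pairwise(genotypes, phenotypes, weights):
    """Same statistic written as a double sum over subject pairs."""
    n = len(phenotypes)
    mean_y = sum(phenotypes) / n
    resid = [y - mean_y for y in phenotypes]

    def kernel(g_i, g_k):
        # Weighted linear kernel between two genotype vectors
        return sum(w * a * b for w, a, b in zip(weights, g_i, g_k))

    return sum(
        resid[i] * resid[k] * kernel(genotypes[i], genotypes[k])
        for i in range(n)
        for k in range(n)
    )


if __name__ == "__main__":
    # Toy data: 4 subjects, 2 variants (illustrative values only)
    G = [[0, 1], [1, 0], [0, 0], [2, 1]]
    y = [0, 1, 0, 1]
    w = [1.0, 0.5]
    print(skat_statistic(G, y, w), skat_statistic_pairwise(G, y, w))
```

The two functions agree because expanding the squared per-variant score gives the pairwise sum; the sketch only computes the statistic and does not compute a p-value, which requires the null distribution.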
