Relating CNVs to transcriptome data at fine resolution: Assessment of the effect of variant size, type, and overlap with functional regions

Genome Research - Volume 21, Issue 12 - Pages 2004-2013 - 2011
Andreas Schlattl¹, Simon Anders¹, Sebastian M. Waszak¹, Wolfgang Huber²,¹, Jan O. Korbel²,¹
¹ European Molecular Biology Laboratory (EMBL), Genome Biology Research Unit, 69117 Heidelberg, Germany.
² European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom.

Abstract

Copy-number variants (CNVs) form an abundant class of genetic variation with a presumed widespread impact on individual traits. While recent advances, such as the population-scale sequencing of human genomes, have facilitated the fine-scale mapping of CNVs, the phenotypic impact of most of these CNVs remains unclear. By relating copy-number genotypes to transcriptome sequencing data, we have evaluated the impact of CNVs, mapped at fine scale, on gene expression. Based on data from 129 individuals with ancestry from two populations, we identified CNVs associated with the expression of 110 genes, with 13% of the associations involving complex, multiallelic CNVs. Categorization of CNVs according to variant type, size, and gene overlap enabled us to examine the impact of different CNV classes on expression variation. While many small (<4 kb) CNVs were associated with expression variation, overall we observed an enrichment of large duplications and deletions, including large intergenic CNVs, relative to the entire set of expression-associated CNVs. Furthermore, the copy number of genes intersecting with CNVs typically correlated positively with the genes' expression, and was also more strongly correlated with expression than nearby single nucleotide polymorphisms, suggesting a frequent causal role of CNVs in expression quantitative trait loci (eQTLs). We also elucidated unexpected cases of negative correlations between copy number and expression by assessing the CNVs' effects on the structure and regulation of genes. Finally, we examined dosage compensation of transcript levels. Our results suggest that association studies can gain in resolution and power by including fine-scale CNV information, such as that obtained from population-scale sequencing.
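To make the idea of relating copy-number genotypes to expression concrete, the following is a minimal sketch of a per-gene association test. It is not the authors' pipeline: the choice of Spearman rank correlation, the permutation test, the Benjamini-Hochberg adjustment (a common choice for false discovery rate control), and all function names and toy values are assumptions made purely for illustration.

```python
import random
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out


def pearson(x, y):
    """Pearson correlation; assumes neither x nor y is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def permutation_p(x, y, n_perm=500, seed=0):
    """Two-sided permutation p-value for the Spearman correlation."""
    rng = random.Random(seed)
    observed = abs(spearman(x, y))
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if abs(spearman(x, y_perm)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from largest to smallest p-value
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical toy data: copy number of one CNV in ten individuals and
# the expression of two made-up genes in the same individuals.
copy_number = [1, 1, 2, 2, 2, 2, 3, 3, 4, 4]
expression = {
    "geneA": [5.1, 4.8, 9.7, 10.2, 9.9, 10.5, 15.3, 14.6, 20.1, 19.4],
    "geneB": [7.2, 3.1, 5.5, 8.0, 2.9, 6.4, 4.2, 7.7, 5.0, 6.1],
}

genes = list(expression)
rho = {g: spearman(copy_number, expression[g]) for g in genes}
pvals = [permutation_p(copy_number, expression[g]) for g in genes]
for g, p_adj in zip(genes, benjamini_hochberg(pvals)):
    print(f"{g}: rho={rho[g]:.2f}, BH-adjusted p={p_adj:.3f}")
```

In this toy setting a positive `rho` corresponds to the case described above, where a gene's expression rises with its copy number, while a negative `rho` would flag the less expected inverse relationship. A real analysis would also need to account for population structure and normalization of the sequencing counts, which this sketch ignores.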
