PennCNV: An integrated hidden Markov model designed for high-resolution copy number variation detection in whole-genome SNP genotyping data

Genome Research - Volume 17, Issue 11 - Pages 1665-1674 - 2007
Kai Wang1, Mingyao Li2, Dexter Hadley3,1, Rui Liu1, Joseph Glessner4, Struan F.A. Grant4, Hákon Hákonarson4, Maja Bućan1
1Department of Genetics, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
2Department of Biostatistics, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
3Department of Biology, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
4Center for Applied Genomics, Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania 19104, USA

Abstract

Comprehensive identification and cataloging of copy number variations (CNVs) is required to provide a complete view of human genetic variation. The resolution of CNV detection in previous experimental designs has been limited to tens or hundreds of kilobases. Here we present PennCNV, a hidden Markov model (HMM) based approach, for kilobase-resolution detection of CNVs from Illumina high-density SNP genotyping data. This algorithm incorporates multiple sources of information, including total signal intensity and allelic intensity ratio at each SNP marker, the distance between neighboring SNPs, the allele frequency of SNPs, and the pedigree information where available. We applied PennCNV to genotyping data generated for 112 HapMap individuals; on average, we detected ∼27 CNVs for each individual with a median size of ∼12 kb. Excluding common rearrangements in lymphoblastoid cell lines, the fraction of CNVs in offspring not detected in parents (CNV-NDPs) was 3.3%. Our results demonstrate the feasibility of whole-genome fine-mapping of CNVs via high-density SNP genotyping.
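To make the idea of an HMM over SNP markers more concrete, the following is a minimal sketch in Python, not the model described in the paper. It assumes only three hidden states, Gaussian emissions on a single total-intensity value per SNP, and a switch probability that grows with the distance between neighboring SNPs. The allelic intensity ratio, allele frequency, and pedigree information mentioned above are left out, and every name and numeric parameter (`EMISSION`, `BASE_SWITCH`, `DECAY_BP`, `viterbi`) is a hypothetical choice made for illustration.

```python
import math

# Toy copy-number states; an assumption of this sketch, not the paper's state set.
STATES = ("loss", "normal", "gain")

# Hypothetical (mean, sd) of the total-intensity signal in each state.
EMISSION = {"loss": (-0.6, 0.2), "normal": (0.0, 0.2), "gain": (0.4, 0.2)}

BASE_SWITCH = 0.01     # assumed upper bound on the probability of leaving a state
DECAY_BP = 100_000.0   # assumed length scale (bp) for the distance dependence


def log_gauss(x, mean, sd):
    """Log density of a normal distribution."""
    return -0.5 * ((x - mean) / sd) ** 2 - math.log(sd * math.sqrt(2 * math.pi))


def log_transition(prev, cur, distance_bp):
    """Log transition probability; switching is more likely across larger gaps."""
    p_switch = BASE_SWITCH * (1.0 - math.exp(-distance_bp / DECAY_BP))
    p_switch = max(p_switch, 1e-12)  # avoid log(0) for coincident markers
    if prev == cur:
        return math.log(1.0 - p_switch)
    return math.log(p_switch / (len(STATES) - 1))


def viterbi(positions, intensities):
    """Return the most likely state path for SNPs sorted by position."""
    prior = math.log(1.0 / len(STATES))
    score = {s: prior + log_gauss(intensities[0], *EMISSION[s]) for s in STATES}
    back = []  # one back-pointer table per SNP after the first
    for i in range(1, len(positions)):
        dist = positions[i] - positions[i - 1]
        new_score, ptr = {}, {}
        for cur in STATES:
            best = max(STATES, key=lambda p: score[p] + log_transition(p, cur, dist))
            new_score[cur] = (score[best] + log_transition(best, cur, dist)
                              + log_gauss(intensities[i], *EMISSION[cur]))
            ptr[cur] = best
        score = new_score
        back.append(ptr)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


# Twelve SNPs spaced 5 kb apart, with a run of six low-intensity markers.
positions = [i * 5000 for i in range(12)]
intensities = [0.01, -0.02, 0.00, -0.61, -0.58, -0.60,
               -0.62, -0.59, -0.60, 0.02, 0.00, -0.01]
calls = viterbi(positions, intensities)
```

In this toy setting a run of consecutive low-intensity SNPs is called as a loss, while a single low-intensity SNP is absorbed into the surrounding normal state because two state switches cost more than one poorly fitting emission. That behavior illustrates why modeling neighboring markers jointly can give more stable calls than thresholding each SNP independently.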
