A simple data-adaptive probabilistic variant calling model

Steve Hoffmann1, Peter F. Stadler2, Korbinian Strimmer3
1Junior Research Group Transcriptome Bioinformatics, University of Leipzig, Härtelstraße 16–18, Leipzig, Germany
2Interdisciplinary Center for Bioinformatics and Bioinformatics Group, University of Leipzig, Härtelstraße 16–18, Leipzig, Germany
3Institute for Medical Informatics, Statistics and Epidemiology, University of Leipzig, Härtelstraße 16–18, Leipzig, D-04107, Germany

Abstract

Keywords

