From FastQ Data to High‐Confidence Variant Calls: The Genome Analysis Toolkit Best Practices Pipeline

Current Protocols in Bioinformatics - Volume 43, Issue 1 - 2013
Géraldine A. Van der Auwera1, Mauricio O. Carneiro1, Christopher Hartl1, Ryan Poplin1, Guillermo del Angel1, Ami Levy‐Moonshine1, Tadeusz Jordan1, Khalid Shakir1, David Roazen1, Joel Thibault1, Eric Banks1, Kiran Garimella2, David Green1, Stacey Gabriel1, Mark A. DePristo1
1Genome Sequencing and Analysis Group, Broad Institute, Cambridge, Massachusetts
2Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, United Kingdom

Abstract

This unit describes how to use BWA and the Genome Analysis Toolkit (GATK) to map genome sequencing data to a reference and produce high‐quality variant calls that can be used in downstream analyses. The complete workflow includes the core next‐generation sequencing (NGS) data‐processing steps that are necessary to make the raw data suitable for analysis by the GATK, as well as the key methods involved in variant discovery using the GATK. Curr. Protoc. Bioinform. 43:11.10.1‐11.10.33. © 2013 by John Wiley & Sons, Inc.
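
As a rough illustration of the map-then-call workflow the abstract outlines, the sketch below assembles shell command strings for one paired-end sample. It only builds the strings and does not run anything. The function name, the file-naming scheme, the jar filenames (`picard.jar`, `GenomeAnalysisTK.jar`), and the particular flags chosen are assumptions made for illustration and are not taken from the unit; the GATK call uses the GATK 3-style `-T` syntax. Intermediate processing steps the unit may prescribe between duplicate marking and variant calling are left out here.

```python
from shlex import quote
from typing import List


def build_pipeline_commands(ref: str, fq1: str, fq2: str, sample: str) -> List[str]:
    """Return shell command strings for a minimal map-then-call run.

    Hypothetical helper: nothing is executed, and all output file names
    are derived from the sample name by an assumed naming scheme.
    """
    # Read-group header; BWA expects the literal two characters "\t" here.
    read_group = f"@RG\\tID:{sample}\\tSM:{sample}\\tPL:ILLUMINA"
    ref_q, fq1_q, fq2_q, s = quote(ref), quote(fq1), quote(fq2), quote(sample)

    return [
        # 1. Map paired-end reads to the reference with BWA-MEM.
        f"bwa mem -M -R {quote(read_group)} {ref_q} {fq1_q} {fq2_q} > {s}.sam",
        # 2. Coordinate-sort the alignments and write BAM.
        f"samtools sort -o {s}.sorted.bam {s}.sam",
        # 3. Mark duplicate reads with Picard.
        f"java -jar picard.jar MarkDuplicates I={s}.sorted.bam "
        f"O={s}.dedup.bam M={s}.metrics.txt",
        # 4. Index the BAM so GATK can access it by position.
        f"samtools index {s}.dedup.bam",
        # 5. Call variants with the GATK HaplotypeCaller (GATK 3 syntax).
        f"java -jar GenomeAnalysisTK.jar -T HaplotypeCaller -R {ref_q} "
        f"-I {s}.dedup.bam -o {s}.vcf",
    ]


if __name__ == "__main__":
    for cmd in build_pipeline_commands("ref.fa", "r1.fq", "r2.fq", "sample1"):
        print(cmd)
```

Returning strings rather than running the tools keeps the sketch easy to inspect; the commands could be written to a script or passed to a workflow manager once the paths and flags have been checked against the installed tool versions.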
