Predicting Genes in Single Genomes with AUGUSTUS

Current Protocols in Bioinformatics - Volume 65, Issue 1 - 2019
Katharina J. Hoff (1), Mario Stanke (1)
(1) University of Greifswald, Institute of Mathematics and Computer Science, Greifswald, Germany

Abstract

AUGUSTUS is a tool for finding protein-coding genes and their exon-intron structure in genomic sequences. It does not necessarily require additional experimental input, as it can be applied in so-called ab initio mode. However, extrinsic evidence from various sources, such as transcriptome sequencing or the annotations of closely related genomes, can be integrated to improve the accuracy and completeness of the annotation. AUGUSTUS can be applied to single genomes or simultaneously to several aligned genomes. Here, we describe the steps required for training AUGUSTUS for the annotation of individual genomes and the steps for performing the actual structural annotation. Further, we describe the generation and integration of extrinsic evidence from various sources. © 2018 by John Wiley & Sons, Inc.
