RGAAT: A Reference-based Genome Assembly and Annotation Tool for New Genomes and Upgrade of Known Genomes

Genomics, Proteomics & Bioinformatics - Volume 16 - Pages 373-381 - 2018
Wanfei Liu [1,2,3], Shuangyang Wu [1,4], Qiang Lin [1,2], Shenghan Gao [1], Feng Ding [5], Xiaowei Zhang [1], Hasan Awad Aljohi [2], Jun Yu [1,2], Songnian Hu [1,2]
[1] CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
[2] Joint Center for Genomics Research (JCGR), King Abdulaziz City for Science and Technology and Chinese Academy of Sciences, Riyadh 11442, Saudi Arabia
[3] Grail Scientific Co. Ltd., Shenyang 110000, China
[4] University of Chinese Academy of Sciences, Beijing 100049, China
[5] Shenzhen Institute of Geriatrics, Shenzhen 518020, China
