Assembly algorithms for next-generation sequencing data
References
Sanger, 1980, Cloning in single-stranded bacteriophage as an aid to rapid DNA sequencing, J. Mol. Biol., 143, 161, 10.1016/0022-2836(80)90196-5
Staden, 1979, A strategy of DNA sequencing employing computer programs, Nucleic Acids Res., 6, 2601, 10.1093/nar/6.7.2601
Pop, 2009, Genome assembly reborn: recent computational challenges, Brief. Bioinform., 10, 354, 10.1093/bib/bbp026
Mardis, 2008, The impact of next-generation sequencing technology on genetics, Trends Genet., 24, 133, 10.1016/j.tig.2007.12.007
Morozova, 2008, Applications of next-generation sequencing technologies in functional genomics, Genomics, 92, 255, 10.1016/j.ygeno.2008.07.001
Strausberg, 2008, Emerging DNA sequencing technologies for human genomic medicine, Drug Discov. Today, 13, 569, 10.1016/j.drudis.2008.03.025
Pettersson, 2009, Generations of sequencing technologies, Genomics, 93, 105, 10.1016/j.ygeno.2008.10.003
Sanger, 1977, DNA sequencing with chain-terminating inhibitors, Proc. Natl. Acad. Sci. U. S. A., 74, 5463, 10.1073/pnas.74.12.5463
Eid, 2009, Real-time DNA sequencing from single polymerase molecules, Science, 323, 133, 10.1126/science.1162986
Ewing, 1998, Base-calling of automated sequencer traces using phred. II. Error probabilities, Genome Res., 8, 186, 10.1101/gr.8.3.186
Huse, 2007, Accuracy and quality of massively parallel DNA pyrosequencing, Genome Biol., 8, R143, 10.1186/gb-2007-8-7-r143
Dohm, 2008, Substantial biases in ultra-short read data sets from high-throughput DNA sequencing, Nucleic Acids Res., 36, e105, 10.1093/nar/gkn425
Harismendy, 2009, Evaluation of next generation sequencing platforms for population targeted sequencing studies, Genome Biol., 10, R32, 10.1186/gb-2009-10-3-r32
Fleischmann, 1995, Whole-genome random sequencing and assembly of Haemophilus influenzae Rd, Science, 269, 496, 10.1126/science.7542800
Adams, 2000, The genome sequence of Drosophila melanogaster, Science, 287, 2185, 10.1126/science.287.5461.2185
Siegel, 2000, Modeling the feasibility of whole genome shotgun sequencing using a pairwise end strategy, Genomics, 68, 237, 10.1006/geno.2000.6303
Phillippy, 2008, Genome assembly forensics: finding the elusive mis-assembly, Genome Biol., 9, R55, 10.1186/gb-2008-9-3-r55
Kececioglu, 2001, Separating repeats in DNA sequence assembly, 176
Whiteford, 2005, An analysis of the feasibility of short read sequencing, Nucleic Acids Res., 33, e171, 10.1093/nar/gni170
Rusch, 2007, The Sorcerer II Global Ocean Sampling expedition: northwest Atlantic through eastern tropical Pacific, PLoS Biol., 5, e77, 10.1371/journal.pbio.0050077
Mavromatis, 2007, Use of simulated data sets to evaluate the fidelity of metagenomic processing methods, Nat. Methods, 4, 495, 10.1038/nmeth1043
Wommack, 2008, Metagenomics: read length matters, Appl. Environ. Microbiol., 74, 1453, 10.1128/AEM.02181-07
Myers, 1995, Toward simplifying and accurately formulating fragment assembly, J. Comput. Biol., 2, 275, 10.1089/cmb.1995.2.275
Idury, 1995, A new algorithm for DNA sequence assembly, J. Comput. Biol., 2, 291, 10.1089/cmb.1995.2.291
Zerbino, 2008, Velvet: algorithms for de novo short read assembly using de Bruijn graphs, Genome Res., 18, 821, 10.1101/gr.074492.107
Pevzner, 2004, De novo repeat classification and fragment assembly, Genome Res., 14, 1786, 10.1101/gr.2395204
Fasulo, 2002, Efficiently detecting polymorphisms during the fragment assembly process, Bioinformatics, 18, S294, 10.1093/bioinformatics/18.suppl_1.S294
Nagarajan, 2009, Parametric complexity of sequence assembly: theory and applications to next generation sequencing, J. Comput. Biol., 16, 897, 10.1089/cmb.2009.0005
Pop, 2008, Bioinformatics challenges of new sequencing technology, Trends Genet., 24, 142, 10.1016/j.tig.2007.12.006
Warren, 2007, Assembling millions of short DNA sequences using SSAKE, Bioinformatics, 23, 500, 10.1093/bioinformatics/btl629
Warren, 2008, SSAKE 3.0: Improved speed, accuracy and contiguity
Dohm, 2007, SHARCGS, a fast and highly accurate short-read assembly algorithm for de novo genomic sequencing, Genome Res., 17, 1697, 10.1101/gr.6435207
Jeck, 2007, Extending assembly of short DNA sequences to handle error, Bioinformatics, 23, 2942, 10.1093/bioinformatics/btm451
Reinhardt, 2009, De novo assembly using low-coverage short read sequence data from the rice pathogen Pseudomonas syringae pv. oryzae, Genome Res., 19, 294, 10.1101/gr.083311.108
Goldberg, 2006, A Sanger/pyrosequencing hybrid approach for the generation of high-quality draft assemblies of marine microbial genomes, Proc. Natl. Acad. Sci. U. S. A., 103, 11240, 10.1073/pnas.0604351103
Myers, 2000, A whole-genome assembly of Drosophila, Science, 287, 2196, 10.1126/science.287.5461.2196
Jaffe, 2003, Whole-genome sequence assembly for mammalian genomes: Arachne 2, Genome Res., 13, 91, 10.1101/gr.828403
Huang, 2005, Generating a genome assembly with PCAP, Curr. Protoc. Bioinformatics, Chapter 11, Unit 11.3
Batzoglou, 2005, Algorithmic Challenges in Mammalian Genome Sequence Assembly
Pop, 2005, DNA sequence assembly algorithms
Sutton, 2007, Shotgun Fragment Assembly, 79
Wang, 1994, On the complexity of multiple sequence alignment, J. Comput. Biol., 1, 337, 10.1089/cmb.1994.1.337
Margulies, 2005, Genome sequencing in microfabricated high-density picolitre reactors, Nature, 437, 376, 10.1038/nature03959
Miller, 2008, Aggressive assembly of pyrosequencing reads with mates, Bioinformatics, 24, 2818, 10.1093/bioinformatics/btn548
Hernandez, 2008, De novo bacterial genome sequencing: millions of very short reads assembled on a desktop computer, Genome Res., 18, 802, 10.1101/gr.072033.107
Hossain, 2009, Crystallizing short-read assemblies around seeds, BMC Bioinformatics, 10, S16, 10.1186/1471-2105-10-S1-S16
Pevzner, 1989, l-Tuple DNA sequencing: computer analysis, J. Biomol. Struct. Dyn., 7, 63, 10.1080/07391102.1989.10507752
Pevzner, 2001, An Eulerian path approach to DNA fragment assembly, Proc. Natl. Acad. Sci. U. S. A., 98, 9748, 10.1073/pnas.171285098
Simpson, 2009, ABySS: A parallel assembler for short read sequence data, Genome Res., 19, 1117, 10.1101/gr.089532.108
Pevzner, 2001, Fragment assembly with double-barreled data, Bioinformatics, 17, S225, 10.1093/bioinformatics/17.suppl_1.S225
Chaisson, 2004, Fragment assembly with short reads, Bioinformatics, 20, 2067, 10.1093/bioinformatics/bth205
Chaisson, 2008, Short read fragment assembly of bacterial genomes, Genome Res., 18, 324, 10.1101/gr.7088808
Chaisson, 2009, De novo fragment assembly with short mate-paired reads: Does the read length matter?, Genome Res., 19, 336, 10.1101/gr.079053.108
Zerbino, 2009, Pebble and rock band: heuristic resolution of repeats and scaffolding in the velvet short-read de novo assembler, PLoS ONE, 4, e8407, 10.1371/journal.pone.0008407
Butler, 2008, ALLPATHS: de novo assembly of whole-genome shotgun microreads, Genome Res., 18, 810, 10.1101/gr.7337908
MacCallum, 2009, ALLPATHS 2: small genomes assembled accurately and with high continuity from short paired reads, Genome Biol., 10, R103, 10.1186/gb-2009-10-10-r103
Li, 2009, De novo assembly of human genomes with massively parallel short read sequencing, Genome Res., 20, 265, 10.1101/gr.097261.109
Li, 2009, The sequence and de novo assembly of the giant panda genome, Nature, 463, 311, 10.1038/nature08696
Li, 2009, Building the sequence map of the human pan-genome, Nat. Biotechnol., 28, 57, 10.1038/nbt.1596
DiGuistini, 2009, De novo genome sequence assembly of a filamentous fungus using Sanger, 454 and Illumina sequence data, Genome Biol., 10, R94, 10.1186/gb-2009-10-9-r94
Schmidt, 2009, A fast hybrid short read fragment assembly algorithm, Bioinformatics, 25, 2279, 10.1093/bioinformatics/btp374
Sundquist, 2007, Whole-genome sequencing and assembly with high-throughput, short-read technologies, PLoS ONE, 2, e484, 10.1371/journal.pone.0000484
Myers, 2005, The fragment assembly string graph, Bioinformatics, 21, ii79, 10.1093/bioinformatics/bti1114
Medvedev, 2008, Ab initio whole genome shotgun assembly with mated short reads, Proceedings of the 12th Annual Research in Computational Biology Conference (RECOMB)
Li, 2008, SOAP: short oligonucleotide alignment program, Bioinformatics, 24, 713, 10.1093/bioinformatics/btn025
Li, 2009, SOAP2: an improved ultrafast tool for short read alignment, Bioinformatics, 25, 1966, 10.1093/bioinformatics/btp336
Li, 2008, Mapping short DNA sequencing reads and calling variants using mapping quality scores, Genome Res., 18, 1851, 10.1101/gr.078212.108
Langmead, 2009, Ultrafast and memory-efficient alignment of short DNA sequences to the human genome, Genome Biol., 10, R25, 10.1186/gb-2009-10-3-r25
Smith, 2008, Using quality scores and longer reads improves accuracy of Solexa read mapping, BMC Bioinformatics, 9, 128, 10.1186/1471-2105-9-128
Schatz, 2009, CloudBurst: Highly Sensitive Read Mapping with MapReduce, Bioinformatics, 25, 1363, 10.1093/bioinformatics/btp236
Rumble, 2009, SHRiMP: accurate mapping of short color-space reads, PLoS Comput. Biol., 5, e1000386, 10.1371/journal.pcbi.1000386
Weese, 2009, RazerS–fast read mapping with sensitivity control, Genome Res., 19, 1646, 10.1101/gr.088823.108
Chen, 2009, PerM: efficient mapping of short sequencing reads with periodic full sensitive spaced seeds, Bioinformatics, 25, 2514, 10.1093/bioinformatics/btp486
Hoffmann, 2009, Fast mapping of short sequences with mismatches, insertions and deletions using index structures, PLoS Comput. Biol., 5, e1000502, 10.1371/journal.pcbi.1000502
Schneeberger, 2009, Simultaneous alignment of short reads against multiple genomes, Genome Biol., 10, R98, 10.1186/gb-2009-10-9-r98
Zhao, 2009, BOAT: Basic Oligonucleotide Alignment Tool, BMC Genomics, 10, S2, 10.1186/1471-2164-10-S3-S2
McKernan, 2009, Sequence and structural variation in a human genome uncovered by short-read, massively parallel ligation sequencing using two-base encoding, Genome Res., 19, 1527, 10.1101/gr.091868.109
Lee, 2009, MoDIL: detecting small indels from clone-end sequencing with mixtures of distributions, Nat. Methods, 6, 473, 10.1038/nmeth.f.256
Hormozdiari, 2009, Combinatorial algorithms for structural variation detection in high-throughput sequenced genomes, Genome Res., 19, 1270, 10.1101/gr.088633.108
Chen, 2009, BreakDancer: an algorithm for high-resolution mapping of genomic structural variation, Nat. Methods, 6, 677, 10.1038/nmeth.1363
Hillier, 2008, Whole-genome sequencing and variant discovery in C. elegans, Nat. Methods, 5, 183, 10.1038/nmeth.1179