Applications of next-generation sequencing to phylogeography and phylogenetics

Molecular Phylogenetics and Evolution - Volume 66, Issue 2 - Pages 526-538 - 2013
John E. McCormack (1), Sarah M. Hird (2, 1), Amanda J. Zellmer (2), Bryan C. Carstens (2), Robb T. Brumfield (2, 1)
(1) Museum of Natural Science, Louisiana State University, Baton Rouge, LA 70803, United States
(2) Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, United States
