BEAST 2.5: An advanced software platform for Bayesian evolutionary analysis

PLoS Computational Biology - Volume 15, Issue 4 - Page e1006650
Remco Bouckaert (1,2), Timothy G. Vaughan (3,4), Joëlle Barido‐Sottani (3,4), Sebastián Duchêne (5), Mathieu Fourment (6), Alexandra Gavryushkina (7), Joseph Heled (8), Graham Jones (9), Denise Kühnert (2), Nicola De Maio (10), Michael Matschiner (11), Fábio K. Mendes (1), Nicola F. Müller (3,4), Huw A. Ogilvie (12), Louis du Plessis (13), Alex Popinga (1), Andrew Rambaut (14), David A. Rasmussen (15), Igor Siveroni (16), Marc A. Suchard (17), Chieh‐Hsi Wu (18), Dong Xie (1), Chi Zhang (19), Tanja Stadler (3,4), Alexei J. Drummond (1)
1. Centre of Computational Evolution, University of Auckland, Auckland, New Zealand
2. Max Planck Institute for the Science of Human History, Jena, Germany
3. ETH Zürich, Department of Biosystems Science and Engineering, 4058 Basel, Switzerland
4. Swiss Institute of Bioinformatics, Lausanne, Switzerland
5. Department of Biochemistry and Molecular Biology, University of Melbourne, Melbourne, Victoria, Australia
6. ithree institute, University of Technology Sydney, Sydney, Australia
7. Department of Biochemistry, University of Otago, Dunedin 9016, New Zealand
8. Independent Researcher, Auckland, New Zealand
9. Department of Biological and Environmental Sciences, University of Gothenburg, Box 461, SE 405 30, Göteborg, Sweden
10. European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridgeshire, UK
11. Department of Environmental Sciences, University of Basel, 4051 Basel, Switzerland
12. Department of Computer Science, Rice University, Houston, TX 77005-1892, USA
13. Department of Zoology, University of Oxford, Oxford OX1 3PS, UK
14. Institute of Evolutionary Biology, University of Edinburgh, Ashworth Laboratories, Edinburgh, EH9 3FL UK
15. Department of Entomology and Plant Pathology, North Carolina State University, Raleigh, NC 27695, USA
16. Department of Infectious Disease Epidemiology, Imperial College London, Norfolk Place, W2 1PG, UK
17. Department of Biomathematics, David Geffen School of Medicine, University of California, Los Angeles, CA, USA
18. Department of Statistics, University of Oxford, OX1 3LB, UK
19. Institute of Vertebrate Paleontology and Paleoanthropology, Chinese Academy of Sciences, Beijing, China

References

R Bouckaert, 2014, BEAST 2: a software platform for Bayesian evolutionary analysis, PLoS computational biology, 10, e1003537, 10.1371/journal.pcbi.1003537

AJ Drummond, 2015, Bayesian evolutionary analysis with BEAST, 10.1017/CBO9781139095112

Bouckaert R, Heled J. DensiTree 2: Seeing trees through the forest. bioRxiv. 2014; p. 012401.

TG Vaughan, 2013, A stochastic simulator of birth–death master equations with application to phylodynamics, Molecular biology and evolution, 30, 1480, 10.1093/molbev/mst057

TG Vaughan, 2014, Efficient Bayesian inference under the structured coalescent, Bioinformatics, 30, 2272, 10.1093/bioinformatics/btu201

R Bouckaert, 2013, Evolutionary rates and HBV: issues of rate estimation with Bayesian molecular methods, Antivir Ther, 18, 497, 10.3851/IMP2656

D Bryant, 2012, Inferring species trees directly from biallelic genetic markers: bypassing gene trees in a full coalescent analysis, Molecular biology and evolution, 29, 1917, 10.1093/molbev/mss086

CH Wu, 2012, Bayesian selection of nucleotide substitution models and their site assignments, Molecular biology and evolution, 30, 669

T Stadler, 2013, Birth–death skyline plot reveals temporal changes of epidemic spread in HIV and hepatitis C virus (HCV), Proceedings of the National Academy of Sciences, 110, 228, 10.1073/pnas.1207965110

CE Hinchliff, 2015, Synthesis of phylogeny and taxonomy into a comprehensive tree of life, Proc Natl Acad Sci U S A, 112, 12764, 10.1073/pnas.1423041112

N De Maio, 2015, PoMo: an allele frequency-based approach for species tree estimation, Systematic biology, 64, 1018, 10.1093/sysbio/syv048

N De Maio, 2018, Bayesian reconstruction of transmission within outbreaks using genomic variants, PLoS computational biology, 14, e1006117, 10.1371/journal.pcbi.1006117

A Gavryushkina, 2014, Bayesian inference of sampled ancestor trees for epidemiology and fossil calibration, PLoS computational biology, 10, e1003919, 10.1371/journal.pcbi.1003919

TG Vaughan, 2017, Inferring ancestral recombination graphs from bacterial genomic data, Genetics, 205, 857, 10.1534/genetics.116.193425

C Zhang, 2017, Bayesian inference of species networks from multilocus sequence data, Molecular biology and evolution

RR Bouckaert, 2017, bModelTest: Bayesian phylogenetic site model averaging and model comparison, BMC evolutionary biology, 17, 42, 10.1186/s12862-017-0890-6

N Goldman, 1994, A codon-based model of nucleotide substitution for protein-coding DNA sequences, Molecular biology and evolution, 11, 725

Z Yang, 2000, Codon-substitution models for heterogeneous selection pressure at amino acid sites, Genetics, 155, 431, 10.1093/genetics/155.1.431

PO Lewis, 2001, A likelihood approach to estimating phylogeny from discrete morphological character data, Systematic biology, 50, 913, 10.1080/106351501753462876

R Sainudiin, 2004, Microsatellite mutation models: insights from a comparison of humans and chimpanzees, Genetics, 168, 383, 10.1534/genetics.103.022665

CH Wu, 2011, Joint inference of microsatellite mutation models, population history and genealogies using transdimensional Markov Chain Monte Carlo, Genetics, 188, 151, 10.1534/genetics.110.125260

N De Maio, 2013, Linking great apes genome evolution across time scales using polymorphism-aware phylogenetic models, Molecular biology and evolution, 30, 2249, 10.1093/molbev/mst131

Bouckaert R, Lockhart P. Capturing heterotachy through multi-gamma site models. bioRxiv. 2015; p. 018101.

M Fourment, 2018, Local and relaxed clocks: the best of both worlds, PeerJ, 6, e5140, 10.7717/peerj.5140

M Matschiner, 2017, Bayesian phylogenetic estimation of clade ages supports trans-Atlantic dispersal of cichlid fishes, Systematic biology, 66, 3

T Stadler, 2009, On incomplete sampling under birth–death models and connections to the sampling-based coalescent, Journal of theoretical biology, 261, 58, 10.1016/j.jtbi.2009.07.018

A Popinga, 2014, Inferring epidemiological dynamics with Bayesian coalescent inference: the merits of deterministic and stochastic models, Genetics

D Kühnert, 2014, Simultaneous reconstruction of evolutionary history and epidemiological dynamics from viral sequences with the birth–death SIR model, Journal of the Royal Society Interface, 11, 20131106, 10.1098/rsif.2013.1106

Vaughan TG, Leventhal GE, Rasmussen DA, Drummond AJ, Welch D, Stadler T. Directly estimating epidemic curves from genomic data. bioRxiv. 2017; p. 142570.

EM Volz, 2012, Complex population dynamics and the coalescent under neutrality, Genetics, 190, 187, 10.1534/genetics.111.134627

D Kühnert, 2016, Phylodynamics with migration: a computational framework to quantify population structure from genomic data, Molecular biology and evolution, 33, 2102, 10.1093/molbev/msw064

N De Maio, 2015, New routes to phylogeography: a Bayesian structured coalescent approximation, PLoS genetics, 11, e1005421, 10.1371/journal.pgen.1005421

NF Müller, 2018, MASCOT: parameter and state inference under the marginal structured coalescent approximation, Bioinformatics

Müller NF, Dudas G, Stadler T. Inferring time-dependent migration and coalescence patterns from genetic sequence and predictor data in structured populations. bioRxiv. 2018; p. 342329.

N De Maio, 2016, SCOTTI: efficient reconstruction of transmission within outbreaks with the structured coalescent, PLoS computational biology, 12, e1005130, 10.1371/journal.pcbi.1005130

RR Bouckaert, 2018, The origin and expansion of Pama–Nyungan languages across Australia, Nature ecology & evolution, 1

R Bouckaert, 2016, Phylogeography by diffusion on a sphere: whole world phylogeography, PeerJ, 4, e2406, 10.7717/peerj.2406

Mendes FK, Bouckaert R, Drummond AJ. SSE, v.1.0.0. Zenodo. 2018.

X Didelot, 2010, Inference of homologous recombination in bacteria using whole genome sequences, Genetics

GR Jones, 2018, Divergence Estimation in the Presence of Incomplete Lineage Sorting and Migration, Systematic Biology

G Jones, 2017, Algorithmic improvements to species delimitation and phylogeny estimation under the multispecies coalescent, Journal of mathematical biology, 74, 447, 10.1007/s00285-016-1034-0

HA Ogilvie, 2016, Computational performance and statistical accuracy of *BEAST and comparisons with other methods, Systematic biology, 65, 381, 10.1093/sysbio/syv118

HA Ogilvie, 2017, StarBEAST2 brings faster species tree inference and accurate estimates of substitution rates, Molecular biology and evolution, 34, 2101, 10.1093/molbev/msx126

Ogilvie HA, Vaughan TG, Matzke NJ, Slater GJ, Stadler T, Welch D, et al. Inferring Species Trees Using Integrative Models of Species Evolution. bioRxiv. 2018.

Müller NF, Ogilvie H, Zhang C, Drummond A, Stadler T. Inference of species histories in the presence of gene flow. bioRxiv. 2018; p. 348391.

W Xie, 2010, Improving marginal likelihood estimation for Bayesian phylogenetic model selection, Systematic biology, 60, 150, 10.1093/sysbio/syq085

R P Maturana, 2018, Model selection and parameter inference in phylogenetics using Nested Sampling, Syst Biol

Bradley S. Synthetic Language Generation and Model Validation in BEAST2. arXiv preprint arXiv:1607.07931. 2016.

S Duchene, Phylodynamic model adequacy using posterior predictive simulations, Systematic Biology

Z Yang, 2003, Comparison of likelihood and Bayesian methods for estimating divergence times using multiple gene loci and calibration points, with application to a radiation of cute-looking mouse lemur species, Systematic biology, 52, 705, 10.1080/10635150390235557

Bouckaert R, Robbeets M. Pseudo Dollo models for the evolution of binary characters along a tree. bioRxiv. 2017; p. 207571.

Z Yang, 2006, Computational molecular evolution, 10.1093/acprof:oso/9780198567028.001.0001

AJ Drummond, 2006, Relaxed phylogenetics and dating with confidence, PLoS biology, 4, e88, 10.1371/journal.pbio.0040088

AJ Drummond, 2010, Bayesian random local clocks, or one rate to rule them all, BMC biology, 8, 114, 10.1186/1741-7007-8-114

G Udny Yule, 1924, A mathematical theory of evolution, based on the conclusions of Dr. JC Willis, FRS, Philosophical Transactions of the Royal Society of London Series B, 213, 21, 10.1098/rstb.1925.0002

DG Kendall, 1949, Stochastic processes and population growth, Journal of the Royal Statistical Society Series B (Methodological), 11, 230, 10.1111/j.2517-6161.1949.tb00032.x

T Stadler, 2010, Sampling-through-time in birth-death trees, Journal of Theoretical Biology, 267, 396, 10.1016/j.jtbi.2010.09.010

JFC Kingman, 1982, The coalescent, Stochastic Processes and their Applications, 13, 235, 10.1016/0304-4149(82)90011-4

RC Griffiths, 1994, Sampling theory for neutral alleles in a varying environment, Philosophical Transactions of the Royal Society B: Biological Sciences, 344, 403, 10.1098/rstb.1994.0079

A Drummond, 2002, Estimating Mutation Parameters, Population History and Genealogy Simultaneously From Temporally Spaced Sequence Data, Genetics, 161, 1307, 10.1093/genetics/161.3.1307

AJ Drummond, 2007, BEAST: Bayesian evolutionary analysis by sampling trees, BMC evolutionary biology, 7, 214, 10.1186/1471-2148-7-214

T Stadler, 2012, Estimating the basic reproductive number from viral sequence data, Mol Biol Evol, 29, 347, 10.1093/molbev/msr217

CJM Whitty, 2014, Infectious disease: Tough choices to reduce Ebola transmission, Nature, 515, 192, 10.1038/515192a

2016, After Ebola in West Africa — Unpredictable Risks, Preventable Epidemics, New England Journal of Medicine, 375, 587, 10.1056/NEJMsr1513109

2015, West African Ebola Epidemic after One Year Slowing but Not Yet under Control, New England Journal of Medicine, 372, 584, 10.1056/NEJMc1414992

W Kermack, 1927, A contribution to the mathematical theory of epidemics, Proc Roy Soc A, 700, 10.1098/rspa.1927.0118

S Nee, 1992, Tempo and mode of evolution revealed from molecular phylogenies, Proceedings of the National Academy of Sciences, 89, 8322, 10.1073/pnas.89.17.8322

TA Heath, 2014, The fossilized birth–death process for coherent calibration of divergence-time estimates, Proceedings of the National Academy of Sciences, 111, E2957, 10.1073/pnas.1319091111

C Zhang, 2015, Total-evidence dating under the fossilized birth–death process, Systematic biology, 65, 228, 10.1093/sysbio/syv080

A Gavryushkina, 2017, Bayesian total-evidence dating reveals the recent crown radiation of penguins, Systematic biology, 66, 57

RA Pyron, 2011, Divergence Time Estimation Using Fossils as Terminal Taxa and the Origins of Lissamphibia, Systematic Biology, 60, 466, 10.1093/sysbio/syr047

F Ronquist, 2012, A total-evidence approach to dating with fossils, applied to the early radiation of the Hymenoptera, Systematic Biology, 61, 973, 10.1093/sysbio/sys058

J Heled, 2011, Calibrated tree priors for relaxed phylogenetics and divergence time estimation, Systematic Biology, 61, 138, 10.1093/sysbio/syr087

Matzke NJ, Wright A. Ground truthing tip-dating methods using fossil Canidae reveals major differences in performance. bioRxiv. 2016; p. 049643.

WP Maddison, 2007, Estimating a binary character’s effect on speciation and extinction, Systematic biology, 56, 701, 10.1080/10635150701607033

NF Müller, 2017, The Structured Coalescent and Its Approximations, Molecular biology and evolution, 34, 2970, 10.1093/molbev/msx186

E Volz, 2018, Bayesian phylodynamic inference with complex models, PLOS Computational Biology

P Beerli, 2001, Maximum likelihood estimation of a migration matrix and effective population sizes in n subpopulations by using a coalescent approach, Proceedings of the National Academy of Sciences, 98, 4563, 10.1073/pnas.081068098

P Lemey, 2009, Bayesian phylogeography finds its roots, PLoS computational biology, 5, e1000520, 10.1371/journal.pcbi.1000520

P Lemey, 2010, Phylogeography takes a relaxed random walk in continuous space and time, Molecular biology and evolution, 27, 1877, 10.1093/molbev/msq067

P Lemey, 2014, Unifying viral genetics and human transportation data to predict the global transmission dynamics of human influenza H3N2, PLoS pathogens, 10, e1003932, 10.1371/journal.ppat.1003932

JH Degnan, 2009, Gene tree discordance, phylogenetic inference and the multispecies coalescent, Trends in Ecology & Evolution, 24, 332, 10.1016/j.tree.2009.01.009

JH Degnan, 2006, Discordance of Species Trees with Their Most Likely Gene Trees, PLOS Genetics, 2, 1, 10.1371/journal.pgen.0020068

S Roch, 2015, Likelihood-based tree reconstruction on a concatenation of aligned sequence data sets can be statistically inconsistent, Theoretical Population Biology, 100, 56, 10.1016/j.tpb.2014.12.005

FK Mendes, 2018, Why Concatenation Fails Near the Anomaly Zone, Systematic Biology, 67, 158, 10.1093/sysbio/syx063

FK Mendes, 2016, Gene Tree Discordance Causes Apparent Substitution Rate Variation, Systematic Biology, 65, 711, 10.1093/sysbio/syw018

R Nichols, 2001, Gene trees and species trees are not the same, Trends in Ecology & Evolution, 16, 358, 10.1016/S0169-5347(01)02203-0

J Heled, 2010, Bayesian Inference of Species Trees from Multilocus Data, Molecular Biology and Evolution, 27, 570, 10.1093/molbev/msp274

G Jones, 2015, DISSECT: an assignment-free Bayesian discovery method for species delimitation under the multispecies coalescent, Bioinformatics, 31, 991, 10.1093/bioinformatics/btu770

S Vitecek, 2017, Integrative taxonomy by molecular species delimitation: multi-locus data corroborate a new species of Balkan Drusinae micro-endemics, BMC Evolutionary Biology, 17, 129, 10.1186/s12862-017-0972-5

G Singh, 2017, Fungal–algal association patterns in lichen symbiosis linked to macroclimate, New Phytologist, 214, 317, 10.1111/nph.14366

PAP Moran, 1958, Random processes in genetics, Mathematical Proceedings of the Cambridge Philosophical Society, 54, 60, 10.1017/S0305004100033193

Y Wang, 2008, Bayesian inference of fine-scale recombination rates using population genomic data, Philos Trans R Soc Lond B Biol Sci, 363, 3921, 10.1098/rstb.2008.0172

EW Bloomquist, 2010, Unifying vertical and nonvertical evolution: a stochastic ARG-based framework, Syst Biol, 59, 27, 10.1093/sysbio/syp076

BS Meyer, 2017, Disentangling incomplete lineage sorting and introgression to refine species-tree estimates for Lake Tanganyika cichlid fishes, Systematic Biology, 66, 531

H Li, 2011, Inference of human population history from individual whole-genome sequences, Nature, 475, 493, 10.1038/nature10231

Barroso GV, Puzovic N, Dutheil J. Inference of recombination maps from a single pair of genomes and its application to archaic samples. bioRxiv. 2018.

AR Francis, 2015, Which Phylogenetic Networks are Merely Trees with Additional Arcs?, Systematic Biology, 64, 768, 10.1093/sysbio/syv037

Y Yu, 2012, The Probability of a Gene Tree Topology within a Phylogenetic Network with Applications to Hybridization Detection, PLOS Genetics, 8, 1

D Wen, 2016, Bayesian Inference of Reticulate Phylogenies under the Multispecies Network Coalescent, PLOS Genetics, 12, 1

R Nielsen, 2001, Distinguishing migration from isolation: a Markov chain Monte Carlo approach, Genetics, 158, 885, 10.1093/genetics/158.2.885

AD Leaché, 2014, The influence of gene flow on species tree estimation: a simulation study, Syst Biol, 63, 17, 10.1093/sysbio/syt049

A Konings, 2015, Tanganyika Cichlids in Their Natural Habitat

W Salzburger, 2002, Speciation via introgressive hybridization in East African cichlids?, Molecular Ecology, 11, 619, 10.1046/j.0962-1083.2001.01438.x

D Brawand, 2014, The genomic substrate for adaptive radiation in African cichlid fish, Nature, 513, 375, 10.1038/nature13726

HF Gante, 2016, Genomics of speciation and introgression in Princess cichlid fishes from Lake Tanganyika, Molecular Ecology, 25, 6143, 10.1111/mec.13767

M Malmstrøm, 2016, Evolution of the immune system influences speciation rates in teleost fishes, Nat Genet, 48, 1204, 10.1038/ng.3645

PO Lewis, 2013, Posterior predictive Bayesian phylogenetic model selection, Systematic biology, 63, 309, 10.1093/sysbio/syt068

Y Fan, 2010, Choosing among partition models in Bayesian phylogenetics, Molecular biology and evolution, 28, 523, 10.1093/molbev/msq224

J Skilling, 2006, Nested sampling for general Bayesian computation, Bayesian analysis, 1, 833, 10.1214/06-BA127

JP Bollback, 2002, Bayesian model adequacy and choice in phylogenetics, Molecular Biology and Evolution, 19, 1171, 10.1093/oxfordjournals.molbev.a004175

A Gelman, 2013, Bayesian data analysis, 10.1201/b16018

S Höhna, 2017, P3: Phylogenetic posterior prediction in RevBayes, Molecular biology and evolution, 35, 1028, 10.1093/molbev/msx286

SJ Greenhill, 2008, The Austronesian basic vocabulary database: from bioinformatics to lexomics, Evolutionary bioinformatics online, 4, 271

R Bouckaert, 2012, Mapping the origins and expansion of the Indo-European language family, Science, 337, 957, 10.1126/science.1219669

J Barido-Sottani, 2017, Taming the BEAST—A Community Teaching Material Resource for BEAST 2, Systematic biology, 67, 170, 10.1093/sysbio/syx060

MA Suchard, 2018, Bayesian phylogenetic and phylodynamic data integration using BEAST 1.10, Virus Evol, 4, vey016, 10.1093/ve/vey016

S Höhna, 2016, RevBayes: Bayesian Phylogenetic Inference Using Graphical Models and an Interactive Model-Specification Language, Syst Biol, 65, 726, 10.1093/sysbio/syw021