New Algorithms and Methods to Estimate Maximum-Likelihood Phylogenies: Assessing the Performance of PhyML 3.0

Systematic Biology - Volume 59, Issue 3 - Pages 307-321 - 2010
Stéphane Guindon1,2, Jean-François Dufayard2, Vincent Lefort2, Maria Anisimova3,2,4, Wim Hordijk5,2, Olivier Gascuel2
1Department of Statistics, University of Auckland, Auckland 1142, New Zealand
2Méthodes et Algorithmes pour la Bioinformatique, LIRMM, CNRS, Université de Montpellier, 34392 Montpellier Cedex 5, France
3Institute of Computational Science, ETH, CH-8092 Zurich, Switzerland
4Swiss Institute of Bioinformatics, CH-1015 Lausanne, Switzerland
5Department of Statistics, University of Oxford, OX1 3TG Oxford, UK

Abstract

Keywords


References

Anisimova, 2001, Accuracy and power of the likelihood ratio test in detecting adaptive molecular evolution, Mol. Biol. Evol., 18, 1585, 10.1093/oxfordjournals.molbev.a003945

Anisimova, 2006, Approximate likelihood-ratio test for branches: a fast, accurate, and powerful alternative, Syst. Biol., 55, 539, 10.1080/10635150600755453

Felsenstein, 1985, Confidence limits on phylogenies: an approach using the bootstrap, Evolution, 39, 783, 10.1111/j.1558-5646.1985.tb00420.x

Felsenstein, 1988, Phylogenies from molecular sequences: inference and reliability, Annu. Rev. Genet., 22, 521, 10.1146/annurev.ge.22.120188.002513

Felsenstein, 2003, Inferring phylogenies

Gascuel, 1997, BIONJ: an improved version of the NJ algorithm based on a simple model of sequence data, Mol. Biol. Evol., 14, 685, 10.1093/oxfordjournals.molbev.a025808

Goldman, 2000, Likelihood-based tests of topologies in phylogenetics, Syst. Biol., 49, 652, 10.1080/106351500750049752

Guindon, 2003, A simple, fast and accurate algorithm to estimate large phylogenies by maximum likelihood, Syst. Biol., 52, 696, 10.1080/10635150390235520

Hordijk, 2005, Improving the efficiency of SPR moves in phylogenetic tree search methods based on maximum likelihood, Bioinformatics, 21, 4338, 10.1093/bioinformatics/bti713

Jobb, 2004, TREEFINDER: a powerful graphical analysis environment for molecular phylogenetics, BMC Evol. Biol., 4, 18, 10.1186/1471-2148-4-18

Jukes, 1969, Evolution of protein molecules, Mammalian protein metabolism, 21, 10.1016/B978-1-4832-3211-9.50009-7

Kishino, 1989, Evaluation of the maximum likelihood estimate of the evolutionary tree topologies from DNA sequence data, and the branching order in hominoidea, J. Mol. Evol., 29, 170, 10.1007/BF02100115

Kuhner, 1994, A simulation comparison of phylogeny algorithms under equal and unequal evolutionary rates, Mol. Biol. Evol., 11, 459

Lanave, 1984, A new method for calculating evolutionary substitution rates, J. Mol. Evol., 20, 86, 10.1007/BF02101990

Le, 2008, An improved general amino acid replacement matrix, Mol. Biol. Evol., 25, 1307, 10.1093/molbev/msn067

Le, 2010, Accounting for solvent accessibility and secondary structure in protein phylogenetics is clearly beneficial, Syst. Biol., 59, 277, 10.1093/sysbio/syq002

Lemmon, 2002, The metapopulation genetic algorithm: an efficient solution for the problem of large phylogeny estimation, Proc. Natl. Acad. Sci. USA, 99, 10516, 10.1073/pnas.162224399

Olsen, 1994, fastDNAmL: a tool for construction of phylogenetic trees of DNA sequences using maximum likelihood, Comput. Appl. Biosci., 10, 41

Ota, 2000, Appropriate likelihood ratio tests and marginal distributions for evolutionary tree models with constraints on parameters, Mol. Biol. Evol., 17, 798, 10.1093/oxfordjournals.molbev.a026358

Pagel, 2005, Mixture models in phylogenetic inference, Mathematics of evolution & phylogeny, 121, 10.1093/oso/9780198566106.003.0005

Penny, 2000, Parsimony, likelihood, and the role of models in molecular phylogenetics, Mol. Biol. Evol., 17, 839, 10.1093/oxfordjournals.molbev.a026364

Price, 2009, FastTree 2.1—approximately maximum-likelihood trees for large alignments

Ranwez, 2001, Quartet-based phylogenetic inference: improvements and limits, Mol. Biol. Evol., 18, 1103, 10.1093/oxfordjournals.molbev.a003881

Ronquist, 2003, MrBayes 3: Bayesian phylogenetic inference under mixed models, Bioinformatics, 19, 1572, 10.1093/bioinformatics/btg180

Sanderson, 1994, TreeBASE: a prototype database of phylogenetic analyses and an interactive tool for browsing the phylogeny of life, Am. J. Bot., 81, 183

Shimodaira, 1999, Multiple comparisons of log-likelihoods with applications to phylogenetic inference, Mol. Biol. Evol., 16, 1114, 10.1093/oxfordjournals.molbev.a026201

Stamatakis, 2006, RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models, Bioinformatics, 22, 2688, 10.1093/bioinformatics/btl446

Stamatakis, 2006, Phylogenetic models of rate heterogeneity: a high performance computing perspective. Proceedings of the 20th IEEE/ACM International Parallel and Distributed Processing Symposium (IPDPS 2006)

Stamatakis, 2008, A rapid bootstrap algorithm for the RAxML Web servers, Syst. Biol., 57, 758, 10.1080/10635150802429642

Steel, 2007, The Bayesian “star paradox” persists for long finite sequences, Mol. Biol. Evol., 24, 1075, 10.1093/molbev/msm028

Susko, 2008, On the distributions of bootstrap support and posterior distributions for a star tree, Syst. Biol., 57, 602, 10.1080/10635150802302468

Vinh, 2004, IQPNNI: moving fast through tree space and stopping in time, Mol. Biol. Evol., 21, 1565, 10.1093/molbev/msh176

Whelan, 2007, New approaches to phylogenetic tree search and their application to large numbers of protein alignments, Syst. Biol., 56, 727, 10.1080/10635150701611134

Whelan, 2001, A general empirical model of protein evolution derived from multiple protein families using a maximum-likelihood approach, Mol. Biol. Evol., 18, 691, 10.1093/oxfordjournals.molbev.a003851

Yang, 1993, Maximum-likelihood estimation of phylogeny from DNA sequences when substitution rates differ over sites, Mol. Biol. Evol., 10, 1396

Zmasek, 2001, ATV: display and manipulation of annotated phylogenetic trees, Bioinformatics, 17, 383, 10.1093/bioinformatics/17.4.383

Zwickl, 2006, Genetic algorithm approaches for the phylogenetic analysis of large biological sequence data sets under the maximum likelihood criterion [PhD dissertation]