RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models
Abstract
Summary: RAxML-VI-HPC (randomized axelerated maximum likelihood for high performance computing) is a sequential and parallel program for inference of large phylogenies with maximum likelihood (ML). Low-level technical optimizations, a modification of the search algorithm, and the use of the GTR+CAT approximation as replacement for GTR+Γ yield a program that is between 2.7 and 52 times faster than the previous version of RAxML. A large-scale performance comparison with GARLI, PHYML, IQPNNI and MrBayes on real data containing 1000 up to 6722 taxa shows that RAxML requires at least 5.6 times less main memory and yields better trees in similar times than the best competing program (GARLI) on datasets up to 2500 taxa. On datasets ≥4000 taxa it also runs 2–3 times faster than GARLI. RAxML has been parallelized with MPI to conduct parallel multiple bootstraps and inferences on distinct starting trees. The program has been used to compute ML trees on two of the largest alignments to date containing 25 057 (1463 bp) and 2182 (51 089 bp) taxa, respectively.
Availability:
Contact: [email protected]
Supplementary information: Supplementary data are available at Bioinformatics online.
References
Chor, 2005, Maximum likelihood of evolutionary trees: hardness and approximation, Bioinformatics, 21, 97, 10.1093/bioinformatics/bti1027
Guindon, 2003, A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood, Syst. Biol., 52, 696, 10.1080/10635150390235520
Hordijk, 2005, Improving the efficiency of SPR moves in phylogenetic tree search methods based on maximum likelihood, Bioinformatics, 21, 4338, 10.1093/bioinformatics/bti713
Ley, 2005, Obesity alters gut microbial ecology, Proc. Natl Acad. Sci. USA, 102, 11070, 10.1073/pnas.0504978102
Ley, 2006, Unexpected diversity and complexity of the Guerrero Negro hypersaline microbial mat, Appl. Envir. Microbiol., 72, 3685, 10.1128/AEM.72.5.3685-3695.2006
Minh, 2005, pIQPNNI: parallel reconstruction of large maximum likelihood phylogenies, Bioinformatics, 21, 3794, 10.1093/bioinformatics/bti594
Robertson, 2005, Phylogenetic diversity and ecology of environmental Archaea, Curr. Opin. Microbiol., 8, 638, 10.1016/j.mib.2005.10.003
Ronquist, 2003, MrBayes 3: Bayesian phylogenetic inference under mixed models, Bioinformatics, 19, 1572, 10.1093/bioinformatics/btg180
Stamatakis, 2006, Phylogenetic models of rate heterogeneity: a high performance computing perspective, 10.1109/IPDPS.2006.1639535
Stamatakis, 2005, RAxML-III: a fast program for maximum likelihood-based inference of large phylogenetic trees, Bioinformatics, 21, 456, 10.1093/bioinformatics/bti191
Zwickl, 2006, Genetic algorithm approaches for the phylogenetic analysis of large biological sequence datasets under the maximum likelihood criterion, PhD thesis, University of Texas at Austin