RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models

Bioinformatics - Volume 22, Issue 21 - Pages 2688-2690 - 2006
Alexandros Stamatakis¹
¹ Swiss Federal Institute of Technology Lausanne, School of Computer and Communication Sciences, Lab Prof. Moret, STATION 14, CH-1015 Lausanne, Switzerland

Abstract

Summary: RAxML-VI-HPC (randomized axelerated maximum likelihood for high performance computing) is a sequential and parallel program for inference of large phylogenies with maximum likelihood (ML). Low-level technical optimizations, a modification of the search algorithm, and the use of the GTR+CAT approximation as a replacement for GTR+Γ yield a program that is between 2.7 and 52 times faster than the previous version of RAxML. A large-scale performance comparison with GARLI, PHYML, IQPNNI and MrBayes on real data containing 1000 to 6722 taxa shows that RAxML requires at least 5.6 times less main memory than the best competing program (GARLI) and yields better trees in similar times on datasets of up to 2500 taxa. On datasets of ≥4000 taxa it also runs 2–3 times faster than GARLI. RAxML has been parallelized with MPI to conduct multiple bootstraps and inferences on distinct starting trees in parallel. The program has been used to compute ML trees on two of the largest alignments to date, containing 25 057 taxa (1463 bp) and 2182 taxa (51 089 bp), respectively.
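The parallel bootstrap mentioned in the summary is an embarrassingly parallel workload: each replicate resamples alignment columns with replacement and can be analysed independently. The following is a minimal sketch of that idea, not RAxML's actual implementation. It uses no MPI library; the "ranks" are simulated by plain function calls, and all names (`bootstrap_columns`, `replicates_for_rank`, `run_rank`, `base_seed`) are hypothetical, as is the round-robin work assignment.

```python
import random


def bootstrap_columns(n_sites, rng):
    """Draw n_sites column indices with replacement (one bootstrap replicate)."""
    return [rng.randrange(n_sites) for _ in range(n_sites)]


def resample_alignment(alignment, columns):
    """Build a pseudo-alignment from the chosen column indices."""
    return {taxon: "".join(seq[c] for c in columns)
            for taxon, seq in alignment.items()}


def replicates_for_rank(rank, n_ranks, n_replicates):
    """Round-robin assignment (an assumption): replicate i goes to rank i % n_ranks."""
    return list(range(rank, n_replicates, n_ranks))


def run_rank(alignment, rank, n_ranks, n_replicates, base_seed=12345):
    """Work done by one simulated rank: resample each replicate assigned to it."""
    n_sites = len(next(iter(alignment.values())))
    results = {}
    for rep in replicates_for_rank(rank, n_ranks, n_replicates):
        # Seed per replicate, so results do not depend on the number of ranks.
        rng = random.Random(base_seed + rep)
        columns = bootstrap_columns(n_sites, rng)
        results[rep] = resample_alignment(alignment, columns)
        # A real program would run an ML tree search on this replicate here.
    return results


# Toy alignment (made-up data, for illustration only).
toy_alignment = {"A": "ACGTAC", "B": "ACGTTC", "C": "ACCTAC"}

# Simulate 4 ranks sharing 10 replicates, then merge their results.
merged = {}
for rank in range(4):
    merged.update(run_rank(toy_alignment, rank, n_ranks=4, n_replicates=10))
```

Seeding each replicate by its index rather than by the rank keeps the set of replicates reproducible when the number of workers changes, which is one reasonable design choice among several.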

Availability:  

Contact:  [email protected]

Supplementary information: Supplementary data are available at Bioinformatics online.

References

Chor, 2005, Maximum likelihood of evolutionary trees: hardness and approximation, Bioinformatics, 21, 97, 10.1093/bioinformatics/bti1027

Guindon, 2003, A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood, Syst. Biol., 52, 696, 10.1080/10635150390235520

Hordijk, 2005, Improving the efficiency of SPR moves in phylogenetic tree search methods based on maximum likelihood, Bioinformatics, 21, 4338, 10.1093/bioinformatics/bti713

Ley, 2005, Obesity alters gut microbial ecology, Proc. Natl Acad. Sci. USA, 102, 11070, 10.1073/pnas.0504978102

Ley, 2006, Unexpected diversity and complexity of the Guerrero Negro hypersaline microbial mat, Appl. Envir. Microbiol., 72, 3685, 10.1128/AEM.72.5.3685-3695.2006

Minh, 2005, pIQPNNI: parallel reconstruction of large maximum likelihood phylogenies, Bioinformatics, 21, 3794, 10.1093/bioinformatics/bti594

Robertson, 2005, Phylogenetic diversity and ecology of environmental Archaea, Curr. Opin. Microbiol., 8, 638, 10.1016/j.mib.2005.10.003

Ronquist, 2003, MrBayes 3: Bayesian phylogenetic inference under mixed models, Bioinformatics, 19, 1572, 10.1093/bioinformatics/btg180

Stamatakis, 2006, Phylogenetic models of rate heterogeneity: a high performance computing perspective, 10.1109/IPDPS.2006.1639535

Stamatakis, 2005, RAxML-III: a fast program for maximum likelihood-based inference of large phylogenetic trees, Bioinformatics, 21, 456, 10.1093/bioinformatics/bti191

Zwickl, 2006, Genetic algorithm approaches for the phylogenetic analysis of large biological sequence datasets under the maximum likelihood criterion, PhD thesis, University of Texas at Austin