IQ-TREE 2: New Models and Efficient Methods for Phylogenetic Inference in the Genomic Era

Molecular Biology and Evolution, Volume 37, Issue 5, Pages 1530-1534, 2020
Bùi Quang Minh1,2, Heiko A. Schmidt3, Olga Chernomor3, Dominik Schrempf3,4, Michael D. Woodhams5, Arndt von Haeseler6,3, Robert Lanfear1
1Department of Ecology and Evolution, Research School of Biology, Australian National University, Canberra, ACT, Australia
2Research School of Computer Science, Australian National University, Canberra, ACT, Australia
3Center for Integrative Bioinformatics Vienna, Max Perutz Labs, University of Vienna and Medical University of Vienna, Vienna, Austria
4Department of Biological Physics, Eötvös Loránd University, Budapest, Hungary
5Discipline of Mathematics, University of Tasmania, Hobart, TAS, Australia
6Bioinformatics and Computational Biology, Faculty of Computer Science, University of Vienna, Vienna, Austria

Abstract

IQ-TREE (http://www.iqtree.org, last accessed February 6, 2020) is a user-friendly and widely used software package for phylogenetic inference using maximum likelihood. Since the release of version 1 in 2014, we have continuously expanded IQ-TREE to integrate a plethora of new models of sequence evolution and efficient computational approaches of phylogenetic inference to deal with genomic data. Here, we describe notable features of IQ-TREE version 2 and highlight the key advantages over other software.
