Improvement of Phylogenies after Removing Divergent and Ambiguously Aligned Blocks from Protein Sequence Alignments

Systematic Biology - Volume 56, Issue 4 - Pages 564-577 - 2007
Gerard Talavera¹, José Castresana¹
¹ Department of Physiology and Molecular Biodiversity, Institute of Molecular Biology of Barcelona, CSIC, Jordi Girona 18, Barcelona, 08034, Spain. E-mail: [email protected] (J.C.)

Abstract

Keywords


References

Aagesen, 2004, The information content of an ambiguously alignable region, a case study of the trnL intron from the Rhamnaceae, Organ. Divers. Evol., 4, 35, 10.1016/j.ode.2003.11.003

Blackshields, 2006, Analysis and comparison of benchmarks for multiple sequence alignment, In Silico Biol., 6, 321

Castresana, 2000, Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis, Mol. Biol. Evol., 17, 540, 10.1093/oxfordjournals.molbev.a026334

Castresana, 1998, Codon reassignment and amino acid composition in hemichordate mitochondria, Proc. Natl. Acad. Sci. USA, 95, 3703, 10.1073/pnas.95.7.3703

Castresana, 1998, The mitochondrial genome of the hemichordate Balanoglossus carnosus and the evolution of deuterostome mitochondria, Genetics, 150, 1115, 10.1093/genetics/150.3.1115

Dayhoff, 1978, A model of evolutionary change in proteins, Atlas of protein sequence and structure, 345

Delsuc, 2005, Phylogenomics and the reconstruction of the tree of life, Nat. Rev. Genet., 6, 361, 10.1038/nrg1603

Do, 2005, ProbCons: Probabilistic consistency-based multiple sequence alignment, Genome Res., 15, 330, 10.1101/gr.2821705

Drummond, 2001, PAL: An object-oriented programming library for molecular evolution and phylogenetics, Bioinformatics, 17, 662, 10.1093/bioinformatics/17.7.662

Edgar, 2004, MUSCLE: Multiple sequence alignment with high accuracy and high throughput, Nucleic Acids Res., 32, 1792, 10.1093/nar/gkh340

Felsenstein, 1989, PHYLIP—Phylogeny inference package (version 3.4), Cladistics, 5, 164

Felsenstein, 2004, Inferring phylogenies

Feng, 1987, Progressive sequence alignment as a prerequisite to correct phylogenetic trees, J. Mol. Evol., 25, 351, 10.1007/BF02603120

Fleissner, 2005, Simultaneous statistical multiple alignment and phylogeny reconstruction, Syst. Biol., 54, 548, 10.1080/10635150590950371

Gatesy, 1993, Alignment-ambiguous nucleotide sites and the exclusion of systematic data, Mol. Phylogenet. Evol., 2, 152, 10.1006/mpev.1993.1015

Geiger, 2002, Stretch coding and block coding: Two new strategies to represent questionably aligned DNA sequences, J. Mol. Evol., 54, 191, 10.1007/s00239-001-0001-5

Grundy, 1999, Phylogenetic inference from conserved sites alignments, J. Exp. Zool., 285, 128, 10.1002/(SICI)1097-010X(19990815)285:2<128::AID-JEZ5>3.0.CO;2-C

Guindon, 2003, A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood, Syst. Biol., 52, 696, 10.1080/10635150390235520

Gutell, 1994, Lessons from an evolving rRNA: 16S and 23S rRNA structures from a comparative perspective, Microbiol. Rev., 58, 10, 10.1128/MMBR.58.1.10-26.1994

Henikoff, 1994, Protein family classification based on searching a database of blocks, Genomics, 19, 97, 10.1006/geno.1994.1018

Herrmann, 1996, CONRAD: A method for identification of variable and conserved regions within proteins by scale-space filtering, Comput. Appl. Biosci., 12, 197

Higgins, 2005, Mind the gaps: Progress in progressive alignment, Proc. Natl. Acad. Sci. USA, 102, 10411, 10.1073/pnas.0504801102

Jeffroy, 2006, Phylogenomics: The beginning of incongruence?, Trends Genet., 22, 225, 10.1016/j.tig.2006.02.003

Jones, 1992, The rapid generation of mutation data matrices from protein sequences, Comput. Appl. Biosci., 8, 275

Katoh, 2005, MAFFT version 5: Improvement in accuracy of multiple sequence alignment, Nucleic Acids Res., 33, 511, 10.1093/nar/gki198

Katoh, 2002, MAFFT: A novel method for rapid multiple sequence alignment based on fast Fourier transform, Nucleic Acids Res., 30, 3059, 10.1093/nar/gkf436

Kjer, 1995, Use of rRNA secondary structure in phylogenetic studies to identify homologous positions: an example of alignment and data presentation from the frogs, Mol. Phylogenet. Evol., 4, 314, 10.1006/mpev.1995.1028

Lake, 1991, The order of sequence alignment can bias the selection of tree topology, Mol. Biol. Evol., 8, 378

Lassmann, 2005, Kalign—An accurate and fast multiple sequence alignment algorithm, BMC Bioinformatics, 6, 298, 10.1186/1471-2105-6-298

Lee, 2001, Unalignable sequences and molecular evolution, Trends Ecol. Evol., 16, 681, 10.1016/S0169-5347(01)02313-8

Löytynoja, 2001, SOAP, cleaning multiple alignments from unstable blocks, Bioinformatics, 17, 573, 10.1093/bioinformatics/17.6.573

Lunter, 2005, Bayesian coestimation of phylogeny and sequence alignment, BMC Bioinformatics, 6, 83, 10.1186/1471-2105-6-83

Lutzoni, 2000, Integrating ambiguously aligned regions of DNA sequences in phylogenetic analyses without violating positional homology, Syst. Biol., 49, 628, 10.1080/106351500750049743

Morrison, 1997, Effects of nucleotide sequence alignment on phylogeny estimation: A case study of 18S rDNAs of apicomplexa, Mol. Biol. Evol., 14, 428, 10.1093/oxfordjournals.molbev.a025779

Needleman, 1970, A general method applicable to the search for similarities in the amino acid sequence of two proteins, J. Mol. Biol., 48, 443, 10.1016/0022-2836(70)90057-4

Notredame, 2000, T-Coffee: A novel method for fast and accurate multiple sequence alignment, J. Mol. Biol., 302, 205, 10.1006/jmbi.2000.4042

Nuin, 2006, The accuracy of several multiple sequence alignment programs for proteins, BMC Bioinformatics, 7, 471, 10.1186/1471-2105-7-471

Ogden, 2006, Multiple sequence alignment accuracy and phylogenetic inference, Syst. Biol., 55, 314, 10.1080/10635150500541730

Pesole, 1992, A statistical method for detecting regions with different evolutionary dynamics in multialigned sequences, Mol. Phylogenet. Evol., 1, 91, 10.1016/1055-7903(92)90023-A

Philippe, 1998, How good are deep phylogenetic trees?, Curr. Opin. Genet. Dev., 8, 616, 10.1016/S0959-437X(98)80028-2

Redelings, 2005, Joint Bayesian estimation of alignment and phylogeny, Syst. Biol., 54, 401, 10.1080/10635150590947041

Robinson, 1981, Comparison of phylogenetic trees, Math. Biosci., 53, 131, 10.1016/0025-5564(81)90043-2

Rodrigo, 1994, Inadequate support for an evolutionary link between the Metazoa and the Fungi, Syst. Biol., 43, 578, 10.1093/sysbio/43.4.578

Smythe, 2006, Nematode small subunit phylogeny correlates with alignment parameters, Syst. Biol., 55, 972, 10.1080/10635150601089001

Stajich, 2002, The Bioperl toolkit: Perl modules for the life sciences, Genome Res., 12, 1611, 10.1101/gr.361602

Stoye, 1998, Rose: Generating sequence families, Bioinformatics, 14, 157, 10.1093/bioinformatics/14.2.157

Strimmer, 1996, Quartet puzzling: A quartet maximum-likelihood method for reconstructing tree topologies, Mol. Biol. Evol., 13, 964, 10.1093/oxfordjournals.molbev.a025664

Swofford, 1996, Phylogenetic inference, Molecular systematics, 407

Tatusov, 2003, The COG database: an updated version includes eukaryotes, BMC Bioinformatics, 4, 41, 10.1186/1471-2105-4-41

Thompson, 1994, CLUSTAL W: Improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice, Nucleic Acids Res., 22, 4673, 10.1093/nar/22.22.4673

Thompson, 2005, BAliBASE 3.0: Latest developments of the multiple sequence alignment benchmark, Proteins, 61, 127, 10.1002/prot.20527

Wheeler, 2001, Homology and the optimization of DNA sequence data, Cladistics, 17, S3, 10.1111/j.1096-0031.2001.tb00100.x

Xia, 2003, 18S ribosomal RNA and tetrapod phylogeny, Syst. Biol., 52, 283, 10.1080/10635150390196948

Yang, 1998, On the best evolutionary rate for phylogenetic analysis, Syst. Biol., 47, 125, 10.1080/106351598261067

Young, 2003, GapCoder automates the use of indel characters in phylogenetic analysis, BMC Bioinformatics, 4, 6, 10.1186/1471-2105-4-6