KaKs_Calculator: Calculating Ka and Ks Through Model Selection and Model Averaging

Genomics, Proteomics & Bioinformatics - Volume 4, Issue 4 - Pages 259-263 - 2006
Zhang Zhang1,2,3, Jun Li1, Xiaoqian Zhao1,2, Jun Wang1,3,4, Gane Ka‐Shu Wong1,4,5, Jun Yu1,3,4
1Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, 101300, China
2Graduate School of Chinese Academy of Sciences, Beijing 100049, China
3Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, China
4James D. Watson Institute of Genome Sciences, Zhejiang University, Hangzhou 310007, China
5UW Genome Center, University of Washington, Seattle, WA 98195, USA

Abstract

KaKs_Calculator is a software package that calculates nonsynonymous (Ka) and synonymous (Ks) substitution rates through model selection and model averaging. Existing methods for this estimation each adopt a specific mutation (substitution) model that accounts for different evolutionary features, and so they yield divergent estimates. KaKs_Calculator instead implements a set of candidate models in a maximum likelihood framework and uses the Akaike information criterion to measure how well each model fits the data, with the aim of including as many features as are needed to capture the evolutionary information in protein-coding sequences accurately. Several existing methods for calculating Ka and Ks are also incorporated into the software. KaKs_Calculator, including source code, compiled executables, and documentation, is freely available for academic use at http://evolution.genomics.org.cn/software.htm.
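The model selection and model averaging idea described above can be illustrated with a minimal Python sketch. It is not the KaKs_Calculator implementation: the model labels, log-likelihoods, parameter counts, and per-model Ka and Ks values are hypothetical placeholders, and the sketch assumes the standard definitions AIC = -2 lnL + 2k and Akaike weights proportional to exp(-ΔAIC / 2).

```python
import math

def aic(log_likelihood, n_params):
    """Akaike information criterion: AIC = -2 lnL + 2k."""
    return -2.0 * log_likelihood + 2.0 * n_params

def akaike_weights(aic_scores):
    """Convert AIC scores into normalized Akaike weights."""
    best = min(aic_scores.values())
    # Relative likelihood of each model given its AIC difference from the best.
    rel = {m: math.exp(-0.5 * (s - best)) for m, s in aic_scores.items()}
    total = sum(rel.values())
    return {m: r / total for m, r in rel.items()}

def model_average(estimates, weights):
    """Weighted average of a per-model estimate using Akaike weights."""
    return sum(weights[m] * estimates[m] for m in estimates)

# Hypothetical maximum likelihood fits (all numbers invented for illustration).
fits = {
    "JC":  {"lnL": -1250.0, "k": 2,  "ka": 0.10, "ks": 0.50},
    "HKY": {"lnL": -1240.0, "k": 6,  "ka": 0.12, "ks": 0.44},
    "GTR": {"lnL": -1238.0, "k": 10, "ka": 0.13, "ks": 0.42},
}

scores = {m: aic(f["lnL"], f["k"]) for m, f in fits.items()}
weights = akaike_weights(scores)

# Model selection: keep the single model with the lowest AIC.
best_model = min(scores, key=scores.get)

# Model averaging: combine estimates from all candidate models.
ka_avg = model_average({m: f["ka"] for m, f in fits.items()}, weights)
ks_avg = model_average({m: f["ks"] for m, f in fits.items()}, weights)

print(best_model, round(ka_avg, 4), round(ks_avg, 4))
```

Selection reports the estimates from the single best-scoring model, whereas averaging lets every candidate contribute in proportion to its weight, which can matter when several models fit almost equally well.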
