Timing the Ancestor of the HIV-1 Pandemic Strains
Abstract
HIV-1 sequences were analyzed to estimate the timing of the ancestral sequence of the main group of HIV-1, the strains responsible for the AIDS pandemic. Using parallel supercomputers and assuming a constant rate of evolution, we applied maximum-likelihood phylogenetic methods to unprecedented amounts of data for this calculation. We validated our approach by correctly estimating the timing of two historically documented points. Using a comprehensive full-length envelope sequence alignment, we estimated the date of the last common ancestor of the main group of HIV-1 to be 1931 (1915–41). Analysis of a gag gene alignment, subregions of envelope including additional sequences, and a method that relaxed the assumption of a strict molecular clock also supported these results.
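As an illustration only, the core idea of dating an ancestor under a constant rate of evolution can be sketched with a simple linear regression of genetic distance on sampling year. This is not the maximum-likelihood procedure used in the study; the function name `estimate_root_date`, the rate, the root year, and the sample values below are all hypothetical and synthetic, chosen only so the sketch can be run and checked.

```python
from statistics import mean


def estimate_root_date(sampling_years, root_to_tip_distances):
    """Fit distance = rate * (year - root_year) by ordinary least squares.

    Under a strict molecular clock, the distance from the common ancestor
    to each sampled sequence grows linearly with sampling year, so the
    year at which the fitted line crosses zero distance estimates the
    date of the ancestor. Returns (rate, root_year).
    """
    if len(sampling_years) != len(root_to_tip_distances):
        raise ValueError("need one distance per sampling year")
    if len(sampling_years) < 2:
        raise ValueError("need at least two samples")

    x_bar = mean(sampling_years)
    y_bar = mean(root_to_tip_distances)
    sxx = sum((x - x_bar) ** 2 for x in sampling_years)
    if sxx == 0:
        raise ValueError("all samples share one year; the rate is not identifiable")
    sxy = sum(
        (x - x_bar) * (y - y_bar)
        for x, y in zip(sampling_years, root_to_tip_distances)
    )

    rate = sxy / sxx  # substitutions per site per year
    if rate <= 0:
        raise ValueError("non-positive rate; the data show no clock-like signal")
    intercept = y_bar - rate * x_bar
    root_year = -intercept / rate  # year at which the fitted distance is zero
    return rate, root_year


if __name__ == "__main__":
    # Synthetic data only: distances generated from an assumed rate and
    # root year, so the regression should recover those two values.
    assumed_rate = 0.002
    assumed_root = 1930.0
    years = [1983, 1987, 1990, 1994, 1997]
    distances = [assumed_rate * (y - assumed_root) for y in years]
    rate, root_year = estimate_root_date(years, distances)
    print(f"rate = {rate:.4f} per site per year, root year = {root_year:.1f}")
```

With real sequences the distances would come from branch lengths on an estimated tree, and the scatter around the fitted line is what gives rise to a confidence interval on the date.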
References
S. Souquiere et al., paper presented at the 7th Conference on Retroviruses and Opportunistic Infections, San Francisco, 2000 (www.retroconference.org/).
D. L. Robertson et al., in Human Retroviruses and AIDS 1999, C. Kuiken et al., Eds. (Los Alamos National Laboratory, Los Alamos, NM, in press) (available at hiv-web.lanl.gov);
Meyer A., et al., Med. Trop. 51, 53 (1991);
Report from the Joint United Nations Programme on HIV/AIDS, Global HIV/AIDS Epidemic Update 1999, available at www.unaids.org/publications/.
Leitner T., Escanilla D., Franzen C., Uhlen M., Albert J., Proc. Natl. Acad. Sci. U.S.A. 93, 10864 (1996);
Olsen G. J., Matsuda H., Hagstrom R., Overbeek R., Comput. Appl. Biosci. 10, 41 (1994).
FastDNAml and DNArates were written by Gary Olsen and colleagues at the Ribosomal Database Project (RDP) at the University of Illinois at Urbana-Champaign (available by anonymous ftp from /).
D. L. Swofford, G. J. Olsen, P. J. Waddell, D. M. Hillis, in Molecular Systematics, D. M. Hillis, C. Moritz, B. K. Mable, Eds. (Sinauer, Sunderland, MA, 1996), pp. 407–514.
D. M. Hillis, B. K. Mable, C. Moritz, in Molecular Systematics, 2nd ed., D. M. Hillis, C. Moritz, B. K. Mable, Eds. (Sinauer, Sunderland, MA, 1996), pp. 515–543.
The alignment was based on those provided in Human Retroviruses and AIDS, B. Korber et al., Eds. (Los Alamos National Laboratory, Los Alamos, NM, 1998).
A complete description of the alignments, details of the phylogenetic analysis, the sequence alignments, and links to new code written for this study are provided at www.santafe.edu/btk/science-paper/bette.html.
; S. Ganeshan et al.
Grassly N., Harvey P., Holmes E., Genetics 151, 427 (1999).
Yang Z., J. Mol. Evol. 39, 105 (1994);
Yang Z., Goldman N., Friday A., Mol. Biol. Evol. 11, 316 (1994);
Huelsenbeck J., Mol. Biol. Evol. 12, 843 (1995);
Kuhner M., Felsenstein J., Mol. Biol. Evol. 11, 459 (1994).
Because it is not possible to test all possible tree configurations, tree-building programs use heuristics to estimate the best tree, and the final tree depends on the input order of sequences. To optimize the final trees, we randomized the input order of the sequences five to seven times, until the best maximum-likelihood scores were very similar (1). Given the number of taxa we included, and consequently the combinatorially vast potential for different branching orders, we do not expect our trees to be optimal solutions. In limited testing, different input orders of sequences did not significantly affect our calculations of the timing of divergence from a common ancestor. We also compared the likelihood of the data under different evolutionary models (1), and over 100 maximum-likelihood trees were run in the course of this study.
Yang Z., Mol. Biol. Evol. 10, 1396 (1993).
D. L. Swofford, PAUP*: Phylogenetic Analysis Using Parsimony (*and Other Methods) (Sinauer, Sunderland, MA, 1999).
We also tested other aspects of our evolutionary model. We found that the assignment of base frequencies by means of the phylogenetic trees gave consistently better results than empirical base frequencies. The REV model performed better than an F84 model (2), which includes rate parameters only for transitions and transversions instead of for each pair of bases. Also, for the envelope gene analyses, the improvement in log-likelihood score from the REV model with a uniform rate of evolution at all sites to the REV model with rate variation at different sites, estimated by the maximum-likelihood method, was many times larger than the number of positions (1), justifying the increase in parameters (3).
B. Korber et al., data not shown.
Wangroongsarb Y., et al., Southeast Asian J. Trop. Med. Public Health 16, 517 (1985);
C. Kuiken et al., Am. J. Epidemiol., in press.
Gottlieb M. S., et al., Morb. Mortal. Wkly. Rep. 30, 250 (1981).
E. Hooper, The River (Little, Brown, Boston, 1999). See pp. 77–82 and 440–443 for discussion of early cases in the United States and Haiti, and pp. 550, 791, and 1009 for a discussion of the number of primate kidneys required to make OPV.
Li W.-H., Tanimura M., Sharp P., Mol. Biol. Evol. 5, 313 (1988);
; T. Gojobori et al., Proc. Natl. Acad. Sci. U.S.A. 340 1605 4108 (1990); J. Kelly, Genet. Res. 64, 1 (1994).
A record of the ages of chimpanzees from Camp Lindi used for research noted a range from <1 to 10 years, with more than 80% less than 4 years old (S. Plotkin, personal communication; data taken from the laboratory notes of F. Deinhardt).
M. Grmek, History of AIDS: Emergence and Origin of a Modern Pandemic (Princeton Univ. Press, Princeton, NJ, 1990), chaps. 10 and 15.
We thank D. Pollock, T. Leitner, and B. Bruno for suggestions concerning phylogenetics, maximum likelihood, and estimating the error on time of sampling; G. Shaw for suggesting the 1959 control; S. Wain-Hobson and G. Myers for clarifying discussions on the interpretation and limitations of these results; B. Foley and C. Kuiken for numerous helpful discussions; and K. Rock and J. Shepard for technical support. G. Olsen and J. Thorne generously supplied source code and helped us interpret their work. The research of the Los Alamos authors was supported under internal funds from the Delphi Project; S.W. and B.K. were supported by NIH (RO1-HD37356); B.K. and M.M. were supported through the Pediatric AIDS Foundation; and an anonymous foundation supplied further support for S.W. B.H.H. was supported by grants NO1 AI 85338, RO1 AI 44596, and RO1 AI 40951 from NIH.