Telling the whole story in a 10,000-genome world
Abstract
Genome sequencing has revolutionized our view of the relationships among genomes, particularly in revealing the confounding effects of lateral genetic transfer (LGT). Phylogenomic techniques have been used to construct purported trees of microbial life. Although such trees are easily interpreted and allow the use of a subset of genomes as "proxies" for the full set, LGT and other phenomena affect the positioning of different groups in genome trees, confounding and potentially invalidating attempts to construct a phylogeny-based taxonomy of microorganisms. Network and graph approaches can reveal complex sets of relationships, but applying these techniques to large data sets is a significant challenge. Notwithstanding the question of what exactly it might represent, generating and interpreting a Tree or Network of All Genomes will only be feasible if current algorithms can be improved upon.
Complex relationships among even the most-similar genomes demonstrate that proxy-based approaches to simplifying large sets of genomes are not sufficient on their own to solve the analysis problem. A phylogenomic analysis of 1173 sequenced bacterial and archaeal genomes generated phylogenetic trees for 159,905 distinct homologous gene sets. The relationships inferred from this set can depend heavily on the inclusion of other taxa: for example, phyla such as Spirochaetes, Proteobacteria and Firmicutes are recovered as cohesive groups or split depending on the presence of other specific lineages. Furthermore, named groups such as Acidithiobacillus, Coprothermobacter and Brachyspira show a multitude of affiliations that are more consistent with their ecology than with small subunit ribosomal DNA-based taxonomy. Network and graph representations can illustrate the multitude of conflicting affinities, but all methods impose constraints on the input data and create challenges of construction and interpretation.
These complex relationships highlight the need for an inclusive approach to genomic data, and current methods with minor alterations will likely scale to allow the analysis of data sets with 10,000 or more genomes. The main challenges lie in the visualization and interpretation of genomic relationships, and the redefinition of microbial taxonomy when subsets of genomic data are so evidently in conflict with one another, and with the "canonical" molecular taxonomy.
The manuscript was reviewed by William Martin, W. Ford Doolittle, Joel Velasco and Eugene Koonin.
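The abstract's observation that a named group can be recovered as cohesive or split depending on which other lineages are present can be illustrated with a toy check. The sketch below is not the analysis used in the study. It assumes an unrooted tree is represented simply as a set of bipartitions (one side of each internal edge), and it treats a group as cohesive when it forms one side of some bipartition. The function name `is_cohesive` and the taxon labels (`A1`, `A2`, `B1`, `B2`, `C1`, `X`) are hypothetical and carry no biological meaning.

```python
from typing import FrozenSet, Iterable, Set


def is_cohesive(
    splits: Iterable[FrozenSet[str]],
    taxa: Set[str],
    group: Set[str],
) -> bool:
    """Return True if `group` is one side of some bipartition of the tree.

    `splits` holds one side of each internal edge of an unrooted tree
    on the leaf set `taxa`. Because either side may have been stored,
    the group and its complement are both checked.
    """
    target = frozenset(group)
    complement = frozenset(taxa) - target
    return any(side in (target, complement) for side in splits)


# Hypothetical group of interest.
group_a = {"A1", "A2"}

# Toy tree 1: ((A1,A2),(B1,B2),C1). Group A sits on its own edge.
taxa_without_x = {"A1", "A2", "B1", "B2", "C1"}
splits_without_x = {
    frozenset({"A1", "A2"}),
    frozenset({"B1", "B2"}),
}

# Toy tree 2: ((A1,(A2,X)),(B1,B2),C1). The added taxon X pairs with A2,
# so no edge separates exactly {A1, A2} from everything else.
taxa_with_x = taxa_without_x | {"X"}
splits_with_x = {
    frozenset({"A2", "X"}),
    frozenset({"A1", "A2", "X"}),
    frozenset({"B1", "B2"}),
}

cohesive_without_x = is_cohesive(splits_without_x, taxa_without_x, group_a)
cohesive_with_x = is_cohesive(splits_with_x, taxa_with_x, group_a)
print(cohesive_without_x, cohesive_with_x)
```

In this invented example the same two-member group is cohesive in the first tree and split in the second, purely because one additional taxon was included. Applied across many gene trees, a check of this kind would give a per-group count of how often cohesion holds, though the thresholds and tree representation would need to match the actual pipeline.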