Alignment-free phylogeny of whole genomes using underlying subwords
Abstract
Keywords
References
Wildman D, Uddin M, Opazo JC, Liu G, Lefort V, Guindon S, Gascuel O, Grossman LI, Romero R, Goodman M: Genomics, biogeography, and the diversification of placental mammals. Proc Natl Acad Sci USA. 2007, 104: 14395-14400. 10.1073/pnas.0704342104
Huynen M, Bork P: Measuring genome evolution. Proc Natl Acad Sci USA. 1998, 95: 5849-5856. 10.1073/pnas.95.11.5849
Chor B, Horn D, Goldman N, Levy Y, Massingham T: Genomic DNA k-mer spectra: models and modalities. Genome Biol. 2009, 10 (10): R108. 10.1186/gb-2009-10-10-r108
Sims GE, Jun SR, Wu GA, Kim SH: Alignment-free genome comparison with feature frequency profiles (FFP) and optimal resolutions. Proc Natl Acad Sci USA. 2009, 106 (8): 2677-2682. 10.1073/pnas.0813249106
Delsuc F, Brinkmann H, Philippe H: Phylogenomics and the reconstruction of the tree of life. Nat Rev Genet. 2005, 6: 361-375.
Ulitsky I, Burstein D, Tuller T, Chor B: The average common substring approach to phylogenomic reconstruction. J Comput Biol. 2006, 13 (2): 336-350. 10.1089/cmb.2006.13.336
Sims GE, Jun SR, Wu GA, Kim SH: Whole-genome phylogeny of mammals: Evolutionary information in genic and nongenic regions. Proc Natl Acad Sci USA. 2009, 106 (40): 17077-17082. 10.1073/pnas.0909377106
Lin J: Divergence measures based on the Shannon entropy. IEEE T Inform Theory. 1991, 37: 145-151. 10.1109/18.61115
Apostolico A, Comin M, Parida L: Mining, compressing and classifying with extensible motifs. Algorithms Mol Biol. 2006, 1: 4. 10.1186/1748-7188-1-4
Apostolico A, Comin M, Parida L: Motifs in Ziv-Lempel-Welch Clef. Proceedings of IEEE DCC Data Compression Conference. IEEE Computer Society, 2004, 72-81.
Giancarlo R, Scaturro D, Utro F: Textual data compression in computational biology: a synopsis. Bioinformatics. 2009, 25 (13): 1575-1586. 10.1093/bioinformatics/btp117
Iliopoulos C, Mchugh J, Peterlongo P, Pisanti N, Rytter W, Sagot MF: A first approach to finding common motifs with gaps. Int J Foundations Comput Sci. 2005, 16 (6): 1145-1154. 10.1142/S0129054105003716
Apostolico A, Comin M, Parida L: Conservative extraction of over-represented extensible motifs. Bioinformatics. 2005, 21 (Suppl 1): i9-i18. 10.1093/bioinformatics/bti1051
Apostolico A, Comin M, Parida L: VARUN: discovering extensible motifs under saturation constraints. IEEE/ACM Trans Comput Biol Bioinformatics. 2010, 7 (4): 752-762.
Kong SG, Fan WL, Chen HD, Hsu ZT, Zhou N, Zheng B, Lee HC: Inverse symmetry in complete genomes and whole-genome inverse duplication. PLoS ONE. 2009, 4 (11): e7553. 10.1371/journal.pone.0007553
Gusfield D: Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology. New York, USA: Cambridge University Press, 1997.
Apostolico A: The myriad virtues of subword trees. Combinatorial Algorithms on Words. Edited by: Apostolico A, Galil Z. 1985, 12: 85-96.
Apostolico A: Maximal words in sequence comparisons based on subword composition. Algorithms and Applications, Volume 6060 of Lecture Notes in Computer Science. Edited by: Elomaa T, Mannila H, Orponen P. Berlin: Springer-Verlag, 2010, 34-44.
Apostolico A, Parida L: Incremental paradigms of motif discovery. J Comput Biol. 2004, 11: 15-25. 10.1089/106652704773416867
Comin M, Verzotto D: Classification of protein sequences by means of irredundant patterns. BMC Bioinformatics. 2010, 11 (Suppl 1): S16.
Comin M, Verzotto D: The Irredundant Class method for remote homology detection of protein sequences. J Comput Biol. 2011, 18 (12): 1819-1829. 10.1089/cmb.2010.0171
Apostolico A, Comin M, Parida L: Bridging lossy and lossless compression by motif pattern discovery. Lect Notes Comput Sci. 2006, 4123: 793-813. 10.1007/11889342_51
Comin M, Parida L: Detection of subtle variations as consensus motifs. Theor Comput Sci. 2008, 395 (2-3): 158-170. 10.1016/j.tcs.2008.01.017
Ukkonen E: Maximal and minimal representations of gapped and non-gapped motifs of a string. Theor Comput Sci. 2009, 410 (43): 4341-4349. 10.1016/j.tcs.2009.07.015
Comin M, Verzotto D: Comparing, ranking and filtering motifs with character classes: application to biological sequences analysis. Biological Knowledge Discovery Handbook: Preprocessing, Mining and Postprocessing of Biological Data. Edited by: Elloumi M, Zomaya AY. Wiley, 2013, chapter 13.
Cormen TH, Leiserson CE, Rivest RL: Introduction to Algorithms, chap. 9. MIT Press, 1990, 178-180.
Kopelowitz T, Lewenstein M: Dynamic weighted ancestors. Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2007). Philadelphia: Society for Industrial and Applied Mathematics, 2007, 565-574.
Smith GJD, Vijaykrishna D, Bahl J, Lycett SJ, Worobey M, Pybus OG, Ma SK, Cheung CL, Raghwani J, Bhatt S, Peiris JSM, Guan Y, Rambaut A: Origins and evolutionary genomics of the 2009 swine-origin H1N1 Influenza A epidemic. Nature. 2009, 459 (7250): 1122-1125.
Shiino T, Okabe N, Yasui Y, Sunagawa T, Ujike M, Obuchi M, Kishida N, Xu H, Takashita E, Anraku A, Ito R, Doi T, Ejima M, Sugawara H, Horikawa H, Yamazaki S, Kato Y, Oguchi A, Fujita N, Odagiri T, Tashiro M, Watanabe H: Molecular Evolutionary Analysis of the Influenza A(H1N1)pdm, May–September, 2009: Temporal and Spatial Spreading Profile of the Viruses in Japan. PLoS ONE. 2010, 5 (6): e11057. 10.1371/journal.pone.0011057
Thompson J, Higgins D, Gibson T: CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 1994, 22: 4673-4680. 10.1093/nar/22.22.4673
Felsenstein J: PHYLIP – Phylogeny Inference Package (Version 3.2). Cladistics. 1989, 5: 164-166.
Cole JR, Wang Q, Cardenas E, Fish J, Chai B, Farris RJ, Kulam-Syed-Mohideen AS, McGarrell DM, Marsh T, Garrity GM, Tiedje JM: The Ribosomal Database Project: improved alignments and new tools for rRNA analysis. Nucleic Acids Res. 2009, 37: D141-D145. 10.1093/nar/gkn879
Martinsen ES, Perkins SL, Schall JJ: A three-genome phylogeny of malaria parasites (Plasmodium and closely related genera): Evolution of life-history traits and host switches. Mol Phylogenet Evol. 2008, 47: 261-273. 10.1016/j.ympev.2007.11.012