Skittle: A 2-Dimensional Genome Visualization Tool
Abstract
It is increasingly evident that the genome contains multiple, overlapping patterns, and that these patterns carry different types of information about both genome function and genome history. Discovering additional genomic patterns that may have biological significance requires novel strategies. To partially address this need, we introduce a new data visualization tool called Skittle. The program first creates a 2-dimensional nucleotide display by assigning four colors to the four nucleotides and then text-wrapping the sequence to a user-adjustable width. This nucleotide display is accompanied by a "repeat map", which comprehensively displays all local repeating units based on analysis of all possible local alignments. Skittle includes a smooth-zooming interface that allows the user to analyze genomic patterns at any scale. Skittle is especially useful for identifying and analyzing tandem repeats, including repeats not normally detectable by other methods. It is also more generally useful for analyzing any genomic data: users can correlate published annotations with observable visual patterns and perform quality control on sequences and constructs. Preliminary observations using Skittle reveal intriguing genomic patterns that are not otherwise obvious, including structured variations inside tandem repeats. The striking visual patterns revealed by Skittle appear to be useful for hypothesis development, and have already led the authors to theorize that imperfect tandem repeats could act as information carriers and may form tertiary structures within the interphase nucleus.
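The two displays described in the abstract can be illustrated with a minimal sketch. It is not Skittle's implementation. The color assignment, the function names `wrap_to_grid` and `offset_match_scores`, and the window parameters are all hypothetical. The second function uses a simple ungapped offset self-comparison as a stand-in for the repeat map; the abstract states only that the real repeat map is based on analysis of all possible local alignments.

```python
# Hypothetical color assignment (RGB). The colors Skittle actually uses
# are not given in the abstract.
COLORS = {
    "A": (0, 0, 0),
    "C": (255, 0, 0),
    "G": (0, 255, 0),
    "T": (0, 0, 255),
}
UNKNOWN = (128, 128, 128)  # assumed fallback for N or other symbols


def wrap_to_grid(seq, width):
    """Color each nucleotide, then text-wrap the pixels into rows of `width`."""
    if width < 1:
        raise ValueError("width must be at least 1")
    pixels = [COLORS.get(base, UNKNOWN) for base in seq.upper()]
    return [pixels[i:i + width] for i in range(0, len(pixels), width)]


def offset_match_scores(seq, start, window, max_offset):
    """Score offsets 1..max_offset for the window beginning at `start`.

    Each score is the fraction of window positions whose base equals the
    base `offset` positions downstream. This is a simplified stand-in for
    a repeat map row, not the alignment analysis Skittle performs.
    """
    scores = []
    for offset in range(1, max_offset + 1):
        pairs = [
            (seq[i], seq[i + offset])
            for i in range(start, start + window)
            if i + offset < len(seq)
        ]
        matches = sum(a == b for a, b in pairs)
        scores.append(matches / len(pairs) if pairs else 0.0)
    return scores


if __name__ == "__main__":
    demo = "ACGT" * 6  # a perfect tandem repeat with a 4-base unit
    grid = wrap_to_grid(demo, 4)
    print(len(grid), "rows of", len(grid[0]), "pixels")
    print(offset_match_scores(demo, 0, 8, 8))
```

When the wrap width equals the repeat unit length, every row of the grid is identical, so the repeat shows up as vertical bars of color. In the offset scores, the same period appears as peaks at the unit length and its multiples.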