Relating underrepresented genomic DNA patterns and tiRNAs: the rule behind the observation and beyond

Springer Science and Business Media LLC - Volume 5 - Pages 1-16 - 2010
Miklos Cserzo1, Gabor Turu1, Peter Varnai1, Laszlo Hunyady1
1Department of Physiology, Semmelweis University, Hungary, EU

Abstract

One of the central problems of post-genomic biology is understanding the regulatory networks of genes. Traditionally, the problem has been approached from the perspective of protein-DNA interactions. In recent years, various types of noncoding RNAs have emerged as potent new players. Although their importance is generally recognized, the exact role of these molecules in the control of gene expression is still mostly unknown. The Human and Mouse genomes were screened with a statistical model for sequence patterns that are underrepresented in these genomes, and a subset of motifs, named spanions, was identified. The motif lists of the two species overlap by 75%, indicating evolutionary conservation of this feature. These motifs are arranged in clusters in close proximity to distinct genetic landmarks: the 5' ends of genes, the exon side of exon/intron junctions and the 5' ends of 3' UTRs. The clusters are typically 20 to 25 bases long. The findings agree with the known C/G bias of promoter regions, while the model captures much more sequence information than a simple composition-based model. According to the presented results, the recently reported transcription initiation RNAs (tiRNAs) in the Human genome are typically transcribed from these spanion clusters, which account for 70% of the published tiRNAs. The model apparently captures the common statistical feature of this new and mostly uncharacterized non-coding RNA class and thereby provides a theoretical background for the experimental observations. The presented results seem to support the emerging model of RNA-driven eukaryotic gene expression control. Beyond that, the model detects spanion clusters at genetic positions where no tiRNA counterpart has been considered or reported. GO-term analysis of genes with a high concentration of spanion clusters in their promoter-proximal region indicates involvement in gene regulatory processes.
The results of the analysis suggest that the gene regulatory potential of small non-coding RNAs is grossly underestimated at present. This article was reviewed by Frank Eisenhaber, Sandor Pongor and Rotem Sorek (nominated by Doron Lancet).
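
The abstract does not specify the statistical model used for the screen. As a minimal illustration of what "underrepresented" can mean for a sequence pattern, the sketch below compares observed k-mer counts with the counts expected from base composition alone. The zero-order background, the function names `kmer_obs_exp` and `underrepresented`, and the 0.5 ratio threshold are all assumptions made for illustration; they are not the authors' method, which the abstract describes as capturing more sequence information than a composition-based model.

```python
from collections import Counter
from itertools import product


def kmer_obs_exp(seq, k):
    """Return {kmer: (observed, expected)} for every k-mer over ACGT.

    ASSUMPTION: the expected count comes from a zero-order model in which
    bases are independent and drawn at the sequence's own base frequencies.
    """
    seq = seq.upper()
    n_windows = len(seq) - k + 1
    base_freq = Counter(seq)
    total = sum(base_freq[b] for b in "ACGT")
    if total == 0 or n_windows <= 0:
        return {}
    # Count every overlapping window of length k.
    observed = Counter(seq[i:i + k] for i in range(n_windows))
    result = {}
    for letters in product("ACGT", repeat=k):
        kmer = "".join(letters)
        p = 1.0
        for b in kmer:
            p *= base_freq[b] / total
        result[kmer] = (observed[kmer], p * n_windows)
    return result


def underrepresented(seq, k, max_ratio=0.5):
    """List (kmer, observed, expected) with observed/expected <= max_ratio.

    ASSUMPTION: max_ratio is an arbitrary illustrative cutoff.
    k-mers with an expected count of zero are skipped.
    """
    hits = []
    for kmer, (obs, exp) in kmer_obs_exp(seq, k).items():
        if exp > 0 and obs / exp <= max_ratio:
            hits.append((kmer, obs, exp))
    # Most strongly depleted k-mers first.
    return sorted(hits, key=lambda h: h[1] / h[2])
```

For example, `underrepresented("AC" * 50, 2)` flags the dinucleotides that the composition model expects but that never occur in a strictly alternating sequence. A genome-scale screen would need a richer background model and a correction for multiple testing, neither of which is attempted here.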
