TRII: A Probabilistic Scoring of Drosophila melanogaster Translation Initiation Sites
Abstract
Relative individual information is a measurement that scores the quality of DNA- and RNA-binding sites for biological machines. The development of analytical approaches to increase the power of this scoring method will improve its utility in evaluating the functions of motifs. In this study, the scoring method was applied to potential translation initiation sites in Drosophila to compute Translation Relative Individual Information (TRII) scores. The weight matrix at the core of the scoring method was optimized based on high-confidence translation initiation sites identified by using a progressive partitioning approach. Comparing the distributions of TRII scores for sites of interest with those for high-confidence translation initiation sites and random sequences provides a new methodology for assessing the quality of translation initiation sites. The optimized weight matrices can also be used to describe the consensus at translation initiation sites, providing a quantitative measure of preferred and avoided nucleotides at each position.
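To illustrate the kind of weight-matrix scoring described above, the following is a minimal sketch, not the authors' implementation. It assumes each weight is the log2 ratio of a position-specific nucleotide frequency to a background frequency, and that a site's score is the sum of the weights for its nucleotides. The exact TRII definition, the sequence window, the background model, and the progressive partitioning step are not given in this abstract and are not reproduced here. The function names, the pseudocount, the uniform background, and the toy sequences are all hypothetical.

```python
import math

BASES = "ACGT"


def build_weight_matrix(sites, background=None, pseudocount=1.0):
    """Build per-position log2-odds weights from aligned, equal-length sites.

    Assumption: weight = log2(observed frequency / background frequency).
    A pseudocount keeps unseen nucleotides from producing log2(0).
    """
    if background is None:
        # Assumed uniform background; a real analysis would estimate this.
        background = {b: 0.25 for b in BASES}
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all training sites must have the same length")
    matrix = []
    for pos in range(length):
        counts = {b: pseudocount for b in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        matrix.append(
            {b: math.log2((counts[b] / total) / background[b]) for b in BASES}
        )
    return matrix


def score_site(matrix, site):
    """Sum the weights of the nucleotides observed at each position (in bits)."""
    if len(site) != len(matrix):
        raise ValueError("site length must match the weight matrix length")
    return sum(column[base] for column, base in zip(matrix, site))


# Toy, made-up training set standing in for high-confidence sites.
toy_sites = ["CAAAATGG", "CAAAATGA", "AAAAATGG", "CACAATGC"]
toy_matrix = build_weight_matrix(toy_sites)

consensus_like_score = score_site(toy_matrix, "CAAAATGG")
unrelated_score = score_site(toy_matrix, "GTGTCCTT")
print(f"consensus-like: {consensus_like_score:.2f} bits")
print(f"unrelated:      {unrelated_score:.2f} bits")
```

Positive weights in a column mark nucleotides that are over-represented relative to the background at that position, and negative weights mark under-represented ones, which is how a matrix of this kind can describe preferred and avoided nucleotides. Scoring many candidate sites and comparing the resulting score distribution with those of high-confidence sites and random sequences would follow the comparison the abstract outlines.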