Identification of Protein Coding Regions of Rice Genes Using Alternative Spectral Rotation Measure and Linear Discriminant Analysis

Genomics, Proteomics & Bioinformatics - Volume 2 - Pages 167-173 - 2004
Jiao Jin1,2
1Department of Statistics and Financial Mathematics, School of Mathematical Sciences, Beijing Normal University, Beijing 100875
2Beijing Genomics Institute, Beijing 101300, China
