Variations of the Mononucleotide and Short Oligonucleotide Distributions in the Genomes of Various Organisms

Journal of Theoretical Biology - Tập 201 - Trang 141-156 - 1999
DAVID HÄRING1, JAROSLAV KYPR1
1Institute of Biophysics, Academy of Sciences of the Czech Republic, Královopolská 135, CZ-61265 Brno, Czech Republic

Tài liệu tham khảo

ADAMS, 1986, CpG deficiency, dinucleotide distributions and nucleosome positioning, Eur. J. Biochem., 165, 107, 10.1111/j.1432-1033.1987.tb11200.x BARRAI, 1990, Oligonucleotide correlations between infector and host genomes hint at evolutionary relationships, Nucl. Acids Res., 18, 3021, 10.1093/nar/18.10.3021 BECKMANN, 1986, Intervening sequences exhibit distinct vocabulary, J. Biomol. Struct. Dyn., 4, 391, 10.1080/07391102.1986.10506357 BERNARDI, 1995, The human genome: organization and evolutionary history, Annu. Rev. Genet., 29, 445, 10.1146/annurev.ge.29.120195.002305 BEUTLER, 1989, Evolution of the genome and the genetic code: Selection at the dinucleotide level by methylation and polyribonucleotide cleavage, Proc. Nat. Acad. Sci. U.S.A., 86, 192, 10.1073/pnas.86.1.192 BLATTNER, 1997, The complete genome sequence of Escherichia coli K-12, Science, 277, 1453, 10.1126/science.277.5331.1453 BORK, 1995, Exploring the Mycoplasma capricolum genome: a minimal cell reveals its physiology, Molec. Microbiol., 16, 955, 10.1111/j.1365-2958.1995.tb02321.x BRENDEL, 1986, Linguistics of Nucleotide Sequences: morphology and comparison of vocabularies, J. Biomol. Struct. Dyn., 4, 11, 10.1080/07391102.1986.10507643 BULT, 1996, Complete genome sequence of the methanogenic archaeon, Methanococcus jannaschii, Science, 273, 1066, 10.1126/science.273.5278.1058 BURGE, 1992, Over- and under-representation of short oligonucleotide in DNA sequences, Proc. Nat. Acad. Sci. U.S.A., 89, 1358, 10.1073/pnas.89.4.1358 DOOLITTLE, 1997, Microbial genomes opened up, Nature, 392, 339, 10.1038/32789 ENDO, 1997, Evolutionary significance of intragenome duplications on human chromosomes, Gene, 205, 19, 10.1016/S0378-1119(97)00478-2 FARMER, 1996, Genomics: the next psychiatric revolution?, Br. J. Psych., 169, 135, 10.1192/bjp.169.2.135 FLEISCHMANN, 1995, Whole-genome random sequencing and assembly of Haemophilus influenzae Rd, Science, 269, 496, 10.1126/science.7542800 GOFFEAU, 1996, Life with 6000 genes, Science, 274, 546, 10.1126/science.274.5287.546 GOLDING, 1995, Protein-based phylogenies support a chimeric origin for the eukaryotic genome, Mol. Biol. Evol., 12, 1, 10.1093/oxfordjournals.molbev.a040178 GRIGORIEV, 1998, Analyzing genomes with cumulative skew diagrams, Nucl. Acids Res., 26, 2286, 10.1093/nar/26.10.2286 HAMADA, 1982, Potential Z-DNA forming sequences are highly dispersed in the human genome, Nature, 298, 396, 10.1038/298396a0 IKEMURA, 1990, Giant G+C% mosaic structures of the human genome found by arrangement of GenBank human DNA sequences according to genetic positions, Genomics, 8, 207, 10.1016/0888-7543(90)90273-W KARLIN, 1994, Comparisons of eukaryotic genomic sequences, Proc. Nat. Acad. Sci. U.S.A., 91, 12832, 10.1073/pnas.91.26.12832 KLUG, 1979, A hypothesis on a specific sequence-dependent conformation of DNA and its relation to the binding of the lac-repressor protein, J. Mol. Biol., 131, 669, 10.1016/0022-2836(79)90196-7 KOONIN, 1997, Prokaryotic genomes: the emerging paradigm of genome-based microbiology, Opinion Genetics Develop., 7, 757, 10.1016/S0959-437X(97)80037-8 KYPR, 1986, A part of codon bias in genes protects protein spatial structures from destabilization by random single points mutations, Biochem. Biophys. Res. Commun., 139, 1094, 10.1016/S0006-291X(86)80289-3 KYPR, 1987, Occurrence of nucleotide triplets in genes and secondary structure of the coded proteins, Int. J. Biol. Macromol., 9, 49, 10.1016/0141-8130(87)90024-9 KYPR, 1987, Unusual codon usage of HIV, Nature, 327, 20, 10.1038/327020a0 KYPR, 1989, Nucleotide composition bias and CpG dinucleotide content in the genomes of HIV and HTLV 1/2, Biochim. Biophys. Acta, 1009, 280, 10.1016/0167-4781(89)90114-0 KYPR, 1988, Conformations of DNA duplexes containing (dA-dT) sequences of bases and their possible biological significance, 105 LOBRY, 1996, Origin of replication of Mycoplasma genitalium, Science, 272, 745, 10.1126/science.272.5262.745 MCATEER, 1995, The effects of sequence context on base dynamics at TpA steps in DNA studied by NMR, Nucl. Acids Res., 23, 3962, 10.1093/nar/23.19.3962 MRÁZEK, 1988, Strand compositional asymmetry in bacterial and large viral genomes, Proc. Nat. Acad. Sci. U.S.A., 95, 3720, 10.1073/pnas.95.7.3720 MRÁZEK, 1992, GLOBIC: a very fast microcomputer program for fingerprinting, characterization and comparison of long nucleotide sequences, CABIOS, 8, 29 MRÁZEK, 1993, UNIREP: a microcomputer program to find unique and repetitive nucleotide sequences in genomes, CABIOS, 9, 355 MRÁZEK, 1995, Middle-range clustering of nucleotides in genomes, CABIOS, 11, 195 MUSHEGIAN, 1996, A minimal gene set for cellular life derived by comparison of complete bacterial genomes, Proc. Nat. Acad. Sci. U.S.A., 93, 10268, 10.1073/pnas.93.19.10268 NUSSINOV, 1981, Nearest neighbour nucleotide patterns, J. Biol. Chem., 256, 8458, 10.1016/S0021-9258(19)68865-4 NUSSINOV, 1984, Doublet frequencies in evolutionary distinct groups, Nucl. Acids. Res, 12, 1749, 10.1093/nar/12.3.1749 NUSSINOV, 1991, Compositional variations in DNA sequences, CABIOS, 7, 287 OHNO, 1989, Various regulatory sequences are deprived of their uniqueness by the universal rule of TA/CG deficiency and TG/CT excess, Proc. Nat. Acad. Sci. U.S.A., 87, 1218, 10.1073/pnas.87.3.1218 OLLILA, 1996, Sequence specificity in CpG mutation hotspots, FEBS Lett., 396, 119, 10.1016/0014-5793(96)01075-7 PEROUTKA, 1997, The medical utility of genomics data in neuropsychiatry: mutational genetics versus association genetics, Opinion Biotechnol., 8, 688, 10.1016/S0958-1669(97)80120-6 PIETROKOVSKI, 1990, Linguistic measure of taxonomic and functional relatedness of nucleotide sequences, J. Biol. Struct. Dyn., 7, 1251, 10.1080/07391102.1990.10508563 PIETROKOVSKI, 1992, Imported sequences in the mitochondrial yeast genome identified by nucleotide linguistics, Gene, 122, 129, 10.1016/0378-1119(92)90040-V POWELL, 1996, Polymorphism revealed by simple sequence repeats, Trends Plant Sci., 1, 215, 10.1016/S1360-1385(96)86898-0 ROWEN, 1997, Sequencing the human genome. Polymorphism revealed by simple sequence repeats, Science, 278, 605, 10.1126/science.278.5338.605 SAKAMOTO, 1993, Development of the overlapping oligonucleotide database and its application to signal sequence search of the human genome, CABIOS, 9, 427 SANTALUCIA, 1998, A unified view of polymer, dumbbell, and oligonucleotide DNA nearest-neighbor thermodynamics, Proc. Nat. Acad. Sci. U.S.A., 95, 1460, 10.1073/pnas.95.4.1460 SELKOV, 1997, A reconstruction of the metabolism of Methanococcus jannaschii from sequence data, Gene, 197, 11, 10.1016/S0378-1119(97)00307-7 SEOIGHE, 1998, Extent of genomic rearrangement after genome duplication in yeast, Proc. Nat. Acad. Sci. U.S.A., 95, 4447, 10.1073/pnas.95.8.4447 SCHERER, 1994, Atypical regions in large genomic DNA sequences, Proc. Nat. Acad. Sci. U.S.A., 91, 7134, 10.1073/pnas.91.15.7134 SMITH, 1996, Microbial pathogen genomes—new strategies for identifying therapeutics and vaccine targets, Tibtech, 14, 290, 10.1016/0167-7799(96)10038-X SPARROW, 1976, Evolution of genome size by DNA doublings, Science, 192, 524, 10.1126/science.1257789 STONEKING, 1997, The human genome project and molecular anthropology, Genome Res., 7, 87, 10.1101/gr.7.2.87 STRAUSS, 1997, Microbial pathogenesis: genomics and beyond, Science, 276, 707, 10.1126/science.276.5313.707 TAUTZ, 1986, Cryptic simplicity in DNA is a major source of genetic variation, Nature, 322, 652, 10.1038/322652a0 TRIFONOV, 1994, On the recombinational origin of protein-sequence-subunit structure, J. Mol. Evol., 38, 543, 10.1007/BF00178853 TRIFONOV, 1995, Segmented structure of protein sequences and early evolution of genome by combinatorial fusion of DNA elements, J. Mol. Evol., 40, 337, 10.1007/BF00163239 TRIFONOV, 1997, Segmented structure of separate and transposable DNA and RNA elements as suggested by their size distributions, J. Biomolec. Struct. Dyn., 14, 449, 10.1080/07391102.1997.10508144 VENTER, 1998, Shotgun sequencing of the human genome, Science, 280, 1540, 10.1126/science.280.5369.1540 VOLINIA, 1989, The frequency of oligonucleotides in mammalian genic regions, CABIOS, 5, 33 WADA, 1984, Stability distribution in the phage λ -DNA double helix: A correlation between physical and genetic structure, J. Biomol. Struct. Dyn., 2, 573, 10.1080/07391102.1984.10507592 WOLFE, 1997, Molecular evidence for an ancient duplication of the entire yeast genome, Nature, 387, 708, 10.1038/42711 YOMO, 1989, Concordant evolution of coding and noncoding regions of DNA made possible by the universal rule of TA/CG deficiency—TG/CT excess, Proc. Nat. Acad. Sci. U.S.A., 86, 8452, 10.1073/pnas.86.21.8452