The contribution of slippage-like processes to genome evolution
Tóm tắt
Simple sequences present in long (>30 kb) sequences representative of the single-copy genome of five species (Homo sapiens, Caenorhabditis elegans Saccharomyces cerevisiae, E. coli, and Mycobacterium leprae) have been analyzed. A close relationship was observed between genome size and the overall level of sequence repetition. This suggested that the incorporation of simple sequences had accompanied increases of genome size during evolution. Densities of simple sequence motifs were higher in noncoding regions than in coding regions in eukaryotes but not in eubacteria. All five genomes showed very biased frequency distributions of simple sequence motifs in all species, particularly in eukaryotes where AAA and TTT predominated. Interspecific comparisons showed that noncoding sequences in eukaryotes showed highly significantly similar frequency distributions of simple sequence motifs but this was not true of coding sequences. ANOVA of the frequency distributions of simple sequence motifs indicated strong contributions from motif base composition and repeat unit length, but much of the variation remained unexplained by these parameters. The sequence composition of simple sequences therefore appears to reflect both underlying sequence biases in slippage-like processes and the action of selection. Frequency distributions of simple sequence motifs in coding sequences correlated weakly or not at all with those in noncoding sequences. Selection on coding sequences to eliminate undesirable sequences may therefore have been strong, particularly in the human lineage.