Representation, searching and discovery of patterns of bases in complex RNA structures

Journal of Computer-Aided Molecular Design - Volume 17 - Pages 537-549 - 2003
Anne-Marie Harrison (1), Darren R. South (1), Peter Willett (2), Peter J. Artymiuk (1)
(1) Krebs Institute for Biomolecular Research, Departments of Molecular Biology and Biotechnology, and Information Studies, University of Sheffield, Western Bank, Sheffield, U.K.
(2) Information Studies, University of Sheffield, Western Bank, Sheffield, U.K.

Abstract

We describe a graph-theoretic method for efficiently searching nucleic acid structural coordinate databases for substructural patterns, using a simplified vectorial representation. Each nucleic acid base is represented by two vectors, and the relative positions of bases are described by the distances between the defined start and end points of the vectors on each base. These points form the nodes of a graph and the distances form its edges, so a pattern search can be performed with a subgraph isomorphism algorithm. The minimal representation was designed to facilitate searches for complex patterns, but it was first tested on simple, well-characterised arrangements of bases such as base pairs and GNRA-tetraloop receptor interactions, for which the method performed very well. A survey of side-by-side base interactions, of which the adenosine platform is the best-known example, also located examples of similar base rearrangements that we consider to be important in structural regulation. A number of examples were found, with GU platforms being particularly prevalent. A GC platform in the RNA of the Thermus thermophilus small ribosomal subunit occupies a position analogous to that of an adenosine platform in other species. An unusual GG platform was also observed close to one of the substrate-binding sites in Haloarcula marismortui large ribosomal subunit RNA.
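The representation in the abstract (labelled points as nodes, inter-point distances as edges, pattern search as subgraph matching) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the point labels such as ("A", "start"), the toy coordinates, the 0.5 tolerance, and the function names build_graph and find_matches are hypothetical, and the plain backtracking search stands in for whichever subgraph isomorphism algorithm the authors actually used. Only one vector per base is shown, to keep the example short.

```python
import math
from itertools import combinations


def build_graph(points):
    """Turn [(label, (x, y, z)), ...] into node labels plus pairwise distances."""
    labels = [label for label, _ in points]
    dist = {}
    for i, j in combinations(range(len(points)), 2):
        d = math.dist(points[i][1], points[j][1])
        dist[(i, j)] = dist[(j, i)] = d
    return labels, dist


def find_matches(pattern, target, tol=0.5):
    """Return every assignment of pattern nodes to target nodes whose labels
    agree and whose pairwise distances agree within tol (hypothetical value)."""
    p_labels, p_dist = build_graph(pattern)
    t_labels, t_dist = build_graph(target)
    matches = []

    def extend(mapping):
        k = len(mapping)  # index of the next pattern node to place
        if k == len(p_labels):
            matches.append(tuple(mapping))
            return
        for t in range(len(t_labels)):
            if t in mapping or t_labels[t] != p_labels[k]:
                continue
            # Every edge back to already-placed nodes must match in length.
            if all(abs(p_dist[(i, k)] - t_dist[(mapping[i], t)]) <= tol
                   for i in range(k)):
                mapping.append(t)
                extend(mapping)
                mapping.pop()

    extend([])
    return matches


# Toy query: start and end points of one vector on an A and on a U.
PATTERN = [
    (("A", "start"), (0.0, 0.0, 0.0)),
    (("A", "end"), (3.0, 0.0, 0.0)),
    (("U", "start"), (0.0, 5.0, 0.0)),
    (("U", "end"), (3.0, 5.0, 0.0)),
]

# Toy database entry: the same arrangement shifted along x, plus two decoys.
TARGET = [
    (("G", "start"), (20.0, 20.0, 20.0)),  # decoy: wrong label
    (("A", "start"), (10.0, 0.0, 0.0)),
    (("A", "end"), (13.0, 0.0, 0.0)),
    (("U", "start"), (10.0, 5.0, 0.0)),
    (("U", "end"), (13.0, 5.0, 0.0)),
    (("U", "start"), (10.0, 9.0, 0.0)),    # decoy: right label, wrong distance
]

print(find_matches(PATTERN, TARGET))  # [(1, 2, 3, 4)]
```

Because the comparison uses distances only, the search is independent of the orientation of the structure, but it also cannot distinguish a pattern from its mirror image; whether and how the published method handles that is not stated in the abstract.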
