DNA, dichotomic classes and frame synchronization: a quasi-crystal framework

Simone Giannerini (1), Diego Luis Gonzalez (2,1), Rodolfo Rosa (2,1)
(1) Dipartimento di Scienze Statistiche, Università di Bologna, via delle Belle Arti 41, 40126 Bologna, Italy
(2) CNR-IMM Sezione di Bologna, Via Gobetti 101, 40129 Bologna, Italy

Abstract

In this article, we show how a new mathematical model of the genetic code can be exploited for investigating the almost periodic properties of DNA and mRNA protein-coding sequences. We present the main mathematical features of the model and highlight its connections with both number theory and group theory. The group theoretic framework presents interesting analogies with the theory of crystals. Moreover, we exploit the information provided by dichotomic classes, binary variables naturally derived from the mathematical model, in order to build statistical classifiers for retrieving and predicting the normal reading frame used by the ribosome in protein synthesis. The results show that coding sequences possess a local informational structure that can be related to frame synchronization processes. The information for retrieving the normal reading frame, which implies the existence of short-range correlations and almost periodic structures related to the organization of codons, offers an interesting analogy with the properties of quasi-crystals. From a theoretical point of view, our results might contribute to clarifying the relation between biological information and shape in nucleic acids and proteins. Also, from the point of view of applications, we present new promising tools for designing efficient algorithms for frame synchronization, which plays a crucial role in faithful synthesis of proteins.
