Codeword design and information encoding in DNA ensembles
Tóm tắt
Encoding of information in DNA-, RNA- and other biomolecules is animportant area of research in fields such as DNA computing,bioinformatics, and, conceivably, microbiology and genetics. This surveyfocuses on two fundamental problems, the codeword design problemand the representation problem of abiotic information, formassively parallel processing with DNA molecules. The first problemrequires libraries of DNA sequences to be designed so that specificduplexes are formed during annealing while simultaneously preventingother undesirable hybridizations from occurring in the course of acomputation in the tube. The second involves a search for efficient andcost-effective methods of representing non-biological information in DNAsequences for storage and retrieval of large amouns of data (tera- andpeta-byte scales). Two approaches are treated, namely thermodynamic andcombinatoric-computational. Both experimental and theoretical resultsare described. A reference list of major works in the area is given.Finally, some open problems deemed important for their possible impacton encoding of abiotic information representation and processing arediscussed.
Tài liệu tham khảo
Adleman L (1994) Molecular computation of solutions of combinatorial problems. Science 266: 1021–1024
Allawi HT and SantaLucia J Jr (1997) Thermodynamics and NMR of internal G·Cmis matches in DNA. Biochemistry 36: 10581–10594
Andronescu M, Dees D, Slaybaugh L, Zhao Y, Condon AE, Cohen B and Skiena S (2002) Algorithms for testing that DNA word designs avoid secondary structures. In: Ohuchi Hagiya A. (ed), pp. 182–195
Arita M and Kobayashi S (2002) DNA sequence design using templates. New Generation Computing 20(3): 263–277
Ausubel FM, Brent R, Kingston RE, Moore DD, Seidman JG, Smith JA, Struhl K, Wang-Iverson P and Bonitz SG (1993) Current Protocols in Molecular Biology. Greene Publishing Associates and Wiley-Interscience, New York
Bahnzhaf W, Daida J, Eiben AE, Garzon MH, Hanovar V, Jakiela M and Smith RE (1999) Proc. of GECCO-99, the Genetic and Evolutionary Computation Conference, Orlando, Florida, Morgan Kaufmann, San Mateo, CA
Baum E (1995) Building an associative memory vastly larger than the brain. Science 268: 583–585
Baum E (1996) DNA sequences useful for computation. In: Landweber and Baum (eds), pp. 122–127
Ben-Dor A, Karp R, Schwikowski B and Yakhini Z (2000) Universal DNA tag systems: A combinatorial design scheme. In: Proceedings of the Fourth International Conference on Computational Molecular Biology, ACM. Tokyo, Japan, April 8–11
Bi H, Chen J, Deaton R, Garzon M, Rubin H and Wood D (2003) A PCR-based protocol for in vitroselection of non-crosshybridizing oligonucleotides. J. of Natural Computing 2(3): 417–426
Bloomfield VA, Crothers DM and Tinoco I Jr (2000) Nucleic Acids: Structures, Properties, and Functions. University Science Books, Sausalito, CA
Boneh D, Dunworth C, Lipton RJ and Sgall J (1996) Making DNA computers error-resistant. In: Landweber and Baum (eds), pp. 163–170
Borer PN, Dengler B, Tinoco I Jr. and Uhlenbeck OC (1974) Stability of ribonucleic acid double-stranded helices. J. Mol. Biol. 86: 843–853
Brenneman B and Condon A (2001) Sequence Design for Biomolecular Computation. In press. Available at http://www.cs.ubc.edu/~condon/papers/wordsurvey.ps
Cantor CR and Schimmel PR (1980) Biophysical Chemistry, Part III: The Behavior of Biological Macromolecules. Freeman, New York
Carbone A and Seeman N (2002) Circuits and programmable self-assembling DNA structures. Proc. of the National Academy of Science USA 99: 12577–12582
Condon A and Rozenberg G eds (2000) Proc. 6th Int. Workshop on DNA-Based Computers DNA 2000, Leiden, The Netherlands. Lecture Notes in Computer Science LNCS 2054, Springer-Verlag
Cukras A, Faulhammer D, Lipton R and Landweber L (1998) Chess games: A model for RNA-based computation. In: Kari et al. (eds), pp. 35–45
Deaton R, Kim JW and Chen J (2003) Design and test of non-crosshybridizing oligonucleotide building blocks for DNA computers and nanostructures. Appl. Phys. Lett. 82: 1305–1307
Deaton RJ, Chen J, Bi H, Garzon M, Rubin H and Wood DH (2002a) A PCR-based protocol for in vitroselection of non-crosshybridizing oligonucleotides. In: Ohuchi Hagiya A (eds), pp. 105–114
Deaton RJ, Chen J, Bi H and Rose JA (2002b) A software tool for generating noncrosshybridizing libraries of DNA oligonucleotides. In: Ohuchi Hagiya A (eds), pp. 211–220
Deaton R, Murphy R, Rose J, Garzon M, Franceschetti D and Stevens SE Jr (1997a) Good encodings for DNA solution to combinatorial problems. Proc. IEEE Conference on Evolutionary Computation, IEEE Computer Society Press (1997), pp. 267–271
Deaton R, Garzon M and Rose JA (1997b) A DNA based artificial immune system for self-nonself discrimination. Proc. of the IEEE Int. Conference on Systems, Man and Cybernetics, Orlando. IEEE Press, Piscataway, NJ, pp. 369–374
Deaton R, Murphy R, Garzon M, Franceschetti D and Stevens E Jr (1996) Good encodings for DNA solution to combinatorial problems. Second Princeton Conference on DNA-based Computers, June 1996. DIMACS series of the American Mathematical Society 44 (1998), pp. 247–258
Faulhammer D, Cukras A, Lipton R and Landweber L (1999) When the knight falls: On constructing an RNA computer. In: Winfree D. Gifford (ed.), pp. 1–7
Feldkamp U, Saghafi S, Banzhaf W and Rouche H (2001) DNA sequence generator: A program for the construction of DNA sequences. In: Jonoska and Seeman, 2002, pp. 23–32
Fraenkel AS (1997) Protein folding, spin glass and computational complexity. In: Rubin and Wood (eds), pp. 101–121
Garey MR and Johnson DS (1979) Computers and Intractability. Freeman, New York
Garzon M (ed) (2003a) Biomolecular machines and artificial evolution. Special Issue of the Journal of Genetic Programming and Evolvable Machines 4(2). Kluwer Academic Publishers
Garzon M, Blain D, Bobba K, Neel A and West M (2003b) Self-assembly of DNA-like structures in-silico. In: Garzon (ed.), pp. 185–200
Garzon M, Neel A and Chen H (2003c) Efficiency and reliability of DNA-based memories. In: E. Cantu-Paz et al. (eds), Proc. GECCO-2003, The Genetic and Evolutionary Programming Conference. Springer-Verlag Lecture Notes in Computer Science 2723, pp. 379–389
Garzon M, Bobba K and Hyde B (2003d) Digital information encoding on DNA. In: Jonoska N, Paun G, Rozenberg G (eds), Aspects of Molecular Computing. Springer-Verlag, Lecture Notes in Computer Science 2950, pp. 152–166
Garzon M, Neel A and Bobba K (2003e) Efficiency and reliability of semantic retrieval in DNA-based memories. In: J. Reif and J. Chen (eds) Proc. DNA9-2003, Int. Workshop on DNA-based Computers. Springer-Verlag Lecture Notes in Computer Science, in press
Garzon M and Oehmen C (2001) Biomolecular computing in virtual test tubes. In: Jonoska and Seeman, 2002, pp. 117–128
Garzon M and Deaton RJ (2000) Biomolecular computing: A definition. Kunstliche Intelligenz 1/00: 63–72
Garzon M and Deaton RJ (1999) Biomolecular computing and programming. IEEE Trans. on Evolutionary Comp. 3(2): 36–50
Garzon M, Deaton R, Rose JA and Franceschetti DR (1999b) Soft molecular computing. In: Winfree and Gifford (eds), pp. 89–98
Garzon M, Neathery PI, Deaton R, Murphy RC, Franceschetti DR and Stevens SE Jr (1997a) A new metric for DNA computing. In: Koza et al., 1997, pp. 472–478
Garzon M, Deaton R, Neathery P, Murphy RC, Franceschetti DR and Stevens E Jr (1997b) On the encoding problem for DNA computing. Poster at The Third DIMACS Workshop on DNA-based Computing, U of Pennsylvania. Preliminary Proceedings, pp. 230–237
Gray DM(1997) Derivation of nearest-neighbor properties from data on nucleic acid oligomer. I. Simple sets of independent sequences and the influence of absent nearest-neighbors. Biopolymers 42: 783–793
Gray DM and Tinoco I Jr (1970) A new approach to the study of sequence-dependent properties of polynucleotides. Biopolymers 9: 223–244
Hagiya M and Ohuchi A eds (2002) Proc. 8th Int. Meeting on DNA-Based Computers. Springer-Verlag Lecture Notes in Computer Science LNCS 2568. Springer-Verlag
Hartemink AJ and Gifford DK (1997) Thermodynamic simulation of deoxyoligonucleotide hybridization of DNA computation. In: Rubin and Wood (eds), pp. 25–37
Head T and Yamamura M and Gal S (2001) Relativized code concepts and multi-tube DNA dictionaries, in press
Head T, Yamamura M and Gal S (1999) Aqueous computing: Writing on molecules. Proceedings of the Congress on Evolutionary Computing (CEC'99)
Heitsch CE, Condon AE and Hood HH (2002) From RNA secondary structure to coding theory: A combinatorial approach. In: Hagiya and Ohuchi, pp. 125–136
Hinze T, Hatnik U and Sturm M (2001) An object oriented simulation of real occurring molecular biological processes for DNA computing and its experimental verification. In: Jonoska and Seeman, 2002, pp. 1–13
Holland JH (1992) Adaptation in Natural and Artificial Systems. MIT Press, Cambridge, MA
Hopcroft, J and Ullman J (1979). Introduction to Automata Theory, Languages and Computation. Addison-Wesley, Reading, MA
Hussini S, Kari L, Rubin H and Konstantinidis S (2001) Coding properties of DNA languages. In: Jonoska and Seeman, 2002, pp. 57–69
Jonoska, N and Seeman N eds (2001) Proc. 7th Int. Workshop on DNA-Based Computers DNA 2000, Tampa, Florida. Lecture Notes in Computer Science LNCS 2340, Springer-Verlag, 2002
Kari L, Kitto R and Thierrin G (2000) Codes, envolutions and DNA encoding. In: Proc. Workshop on Coding Theory, London, Ontario, in press
Kari L, Rubin H and Wood D eds (1998) Proc. 4th DIMACS workshop on DNA Computers, University of Pennsylvania, 1998. Special issue of Biosystems 52: 1–3 (1999)
Karp R, Kenyon C and Waarts O (1996) Error-resilient DNA Computation. Proc. 7th Annual Symposium on Discrete Algorithms SODA, pp. 458–467
Karp RM and Wigderson A (1985) A fast parallel algorithm for the maximal independent set problem. Journal of the Association for Computing Machinery 32: 762–773
Klein JP, Leete TH and Rubin H (1997) A biomolecular implementation of logically reversible computation with minimal energy dissipation. In: Rubin and Wood, pp. 15–23
Kool ET (1999) New molecular strategies for joining, pairing, and amplifying. In: Winfree and Gifford, p. 62
Koza JR, Deb K, Dorigo M, Fogel DB, Garzon M, Iba H and Riolo RL (eds) (1998) Proc. 3rd Annual Genetic Programming Conference. Morgan Kaufmann, San Mateo, California
Koza JR, Deb K, Dorigo M, Fogel DB, Garzon M, Iba H and Riolo RL (eds) (1997) Proc. 2nd Annual Genetic Programming Conference. Morgan Kaufmann, San Mateo, California
Landweber, L and Kari L (1999) The evolution of cellular computing: Nature's solution to a computational problem. In: Kari et al., 1998, pp. 3–13
Landweber LF and Baum EB (eds) (1996) DNA Based Computers II. Proc. of the Second DIMACS Workshop on DNA-Based Computers, Princeton University. DIMACS Series in Discrete Mathematics and Theoretical Computer Science, Providence, RI: American Mathematical Society, vol. 44 (1998)
Luby M (1986) A simple parallel algorithm for the maximal independent set problem. SIAM Journal on Computing 15: 1036–1053
Mao C, LaBean TH, Reif JH and Seeman NC (2000) Logical computation using algorithmic self-assembly of DNA triple-crossover molecules. Nature 407: 493–496
Marathe A, Condon AE and Corn RM(1999) On combinatorial DNA word design. In:Winfree and Gifford, 1999, pp. 75–88
McCaskill JS (1990) The equilibrium partition function and base pair binding probabilities for RNA secondary structure. Biopolymers 29: 1105–1119
Mezard M, Parisi G and Virasoro MA (1987) Spin Glass Theory and Beyond.World Scientific, Singapore
Mitchell J and Yurke B (2001) DNA scissors. In: Jonoska and Seeman, 2002, pp. 263–268
Monasson R, Zecchina R, Kirkpatrick S, Selman B and Troyansky L (1999) Determining computational complexity from characteristic phase transitions. Nature 400: 133–137
Mount D (2001) Bioinformatics: Sequence and Genome Analysis. Spring Harbor Lab Press
MD Nusser M and Deaton RJ (2003) DNA computing with in vitroselection. In: Garzon, pp. 173–183
Poland D and Scheraga HA (1970) Theory of Helix-Coil Transitions in Biopolymers. Academic Press, Inc., New York
Reif JH, LaBean TH, Pirrung M, Rana VS, Guo B, Kingsford C and Wickman GS (2001) Experimental construction of very-large scale databases with associative search capability. In: Jonoska and Seeman, 2002, pp. 231–247
Roman J (1995) The Theory of Error-Correcting Codes. Springer-Verlag, Berlin
Rose JA, Deaton RJ, Hagiya M and Suyama A (2000) The fidelity of the tag-antitag system. In: Jonoska and Seeman, 2001, pp. 138–149
Rose JA, Deaton R, Franceschetti DR, Garzon MH, Stevens SE Jr (1999) A statistical mechanical treatment of error in the annealing biostep of DNA computation. Banzhaf et al., pp. 1829–1834
Rose JA, Deaton R, Garzon M, Murphy RC, Franceschetti DR and Stevens SE Jr (1997) The effect of uniform melting temperatures on the efficiency of DNA computing. In: Rubin and Wood, pp. 35–42
Rubin H and Wood D (eds) (1997) Third DIMACS Workshop on DNA-Based Computers, The University of Pennsylvania, 1997. DIMACS Series in Discrete Mathematics and Theoretical Computer Science, Providence, RI. American Mathematical Society 48 (1999)
SantaLucia J (2001) DNA hybridization is not digital. Plenary address at the Seventh International Meeting on DNA Based Computers
SantaLucia J Jr (1998) A unified view of polymer, dumbell, and oligonucleotide nearestneighbor thermodynamics. Proc. Natl. Acad. Sci. 95: 1460–1465
SantaLucia J Jr, Allawi HT and Seneviratne PA (1996) Improved nearest-neighbor parameters for predicting DNA duplex stability. Biochemistry 35: 3555–3562
Shin SY, Kim DM, Lee IH and Zhang BT (2002) Evolutionary Sequence Generation for Reliable DNA Computing. In IEEE Congress on Evolutionary Computation 2002, in press
Shor PW (1996) Fault-tolerant quantum computation. Proc. IEEE Foundations of Computer Science FOCS, 56–65
Smith TF and Waterman MS (1981) The identification of common molecular subsequences. J. Mol. Biol. 147: 195–197
Wartell RM and Benight AS (1985) Thermal denaturation of DNA molecules. Physics Reports 126: 67–107
Watson JD, Hopkins NH, Roberts JW, Steitz JA and Weiner AM (1987) Molecular Biology of the Gene, 4th ed. The Benjamin/Cummings Publishing Co., Menlo Park, CA
Weigt M and Hartmann AK (2000) Number of guards needed by a museum: A phase transition in vertex covering of random graphs. Phys. Rev. Lett. 84: 6118–6121
West M, Garzon MH and Blain D (2003) DNA-like genomes for evolution in silico. In: E. Cantu-Paz et al. (eds), Proc. GECCO-2003, The Genetic and Evolutionary Programming Conference. Springer-Verlag Lecture Notes in Computer Science 2723, pp. 413–424
Wetmur JG (1997) Physical chemistry of nucleic acid hybridization. In: Rubin and Wood, pp. 1–23
Winfree E and Gifford D eds. (1999) Fifth International Metting on DNA Based Computers, MIT, Boston, MA. DIMACS Series in Discrete Mathematics and Theoretical Computer Science, Providence, RI. American Mathematical Society, 44 (2000)
Winfree E, Liu F, Wenzler LA and Seeman NC (1998) Design and self-assembly of two-dimensional DNA crystals. Nature 394: 539–544
Yamamoto M, Yamashita J, Shiba T, Hirayama T, Takiya S, Suzuki K, Munekata M and Ohuchi A (1999) A study on the hybridization process in DNA computing. In: Winfree and Gifford, pp. 101–110
Yurke B and Mills AP Jr (2003) Using DNA to power Nanostructures. In: Garzon, 2003a, pp. 111–122
Yurke B, Tauberfield AJ, Mills AP Jr, Simmel FC and Neumann JL (2000) A DNA-fuelled molecular machine made of DNA. Nature 406: 605–608
Zhang BT and Shin SY (1998) Molecular algorithms for efficient and reliable DNA computing. In: Koza et al., 1998, pp. 735–742