Replacing suffix trees with enhanced suffix arrays

Journal of Discrete Algorithms - Tập 2 - Trang 53-86 - 2004
Mohamed Ibrahim Abouelhoda1, Stefan Kurtz2, Enno Ohlebusch1
1Faculty of Computer Science, University of Ulm, 89069 Ulm, Germany
2Center for Bioinformatics, University of Hamburg, 20146, Hamburg, Germany

Tài liệu tham khảo

Abouelhoda, 2002, The enhanced suffix array and its applications to genome analysis, vol. 2452, 449 Abouelhoda, 2002, Optimal exact string matching based on suffix arrays, vol. 2476, 31 Apostolico, 1985, The myriad virtues of subword trees, 85 Bender, 2000, The LCA problem revisited, 88 Bentley, 1997, Fast algorithms for sorting and searching strings, 360 M. Burrows, D.J. Wheeler, A block-sorting lossless data compression algorithm, Research Report 124, Digital Systems Research Center, 1994 Chang, 1994, Sublinear approximate string matching and biological applications, Algorithmica, 12, 327, 10.1007/BF01185431 Delcher, 1999, Alignment of whole genomes, Nucl. Acids Res., 27, 2369, 10.1093/nar/27.11.2369 Delcher, 2002, Fast algorithms for large-scale genome alignment and comparison, Nucl. Acids Res., 30, 2478, 10.1093/nar/30.11.2478 Ferragina, 2000, Opportunistic data structures with applications, 390 Ferragina, 2001, An experimental study of an opportunistic index, 269 P. Ferragina, G. Manzini, On compressing and indexing data, Technical Report TR-02-01, Dipartimento di Informatica, Università di Pisa, 2002 Gonnet, 1992, New indices for text: PAT trees and PAT arrays, 66 Grossi, 2000, Compressed suffix arrays and suffix trees with applications to text indexing and string matching, 397 Gusfield, 1997 D. Gusfield, J. Stoye, Linear time algorithms for finding and representing all the tandem repeats in a string, Report CSE-98-4, Computer Science Division, University of California, Davis, 1998 Höhl, 2002, Efficient multiple genome alignment, Bioinformatics, 18, S312, 10.1093/bioinformatics/18.suppl_1.S312 Hon, 2002, Space-economical algorithms for finding maximal unique matches, vol. 2373, 144 Kärkkäinen, 2003, Simple linear work suffix array construction, vol. 2719, 943 Kasai, 2001, Linear-time longest-common-prefix computation in suffix arrays and its applications, vol. 2089, 181 Kim, 2003, Linear-time construction of suffix arrays, vol. 2676, 186 Knight Ko, 2003, Space efficient linear time construction of suffix arrays, vol. 2676, 200 Kolpakov, 1999, Finding maximal repetitions in a word in linear time, 596 Kurtz, 1999, Reducing the space requirement of suffix trees, Software—Practice and Experience, 29, 1149, 10.1002/(SICI)1097-024X(199911)29:13<1149::AID-SPE274>3.0.CO;2-O S. Kurtz, A time and space efficient algorithm for the substring matching problem, Technical Report, Zentrum für Bioinformatik, Universität Hamburg, 2003 Kurtz, 2001, REPuter: the manifold applications of repeat analysis on a genomic scale, Nucl. Acids Res., 29, 4633, 10.1093/nar/29.22.4633 Lander, 2001, Initial sequencing and analysis of the human genome, Nature, 409, 860, 10.1038/35057062 Manber, 1993, Suffix arrays: a new method for on-line string searches, SIAM J. Comput., 22, 935, 10.1137/0222058 O'Keefe, 2000, The pathological consequences and evolutionary implications of recent human genomic duplications, 29 Sadakane, 2002, Succinct representations of lcp information and improvements in the compressed suffix arrays, 225 Weiner, 1973, Linear pattern matching algorithms, 1 Ziv, 1977, A universal algorithm for sequential data compression, IEEE Trans. Inform. Theory, 23, 337, 10.1109/TIT.1977.1055714 Ziv, 1978, Compression of individual sequences via variable length coding, IEEE Trans. Inform. Theory, 24, 530, 10.1109/TIT.1978.1055934