Computing the longest common prefix array based on the Burrows–Wheeler transform

Journal of Discrete Algorithms - Tập 18 - Trang 22-31 - 2013
Timo Beller1, Simon Gog1, Enno Ohlebusch1, Thomas Schnattinger1
1Institute of Theoretical Computer Science, University of Ulm, 89069 Ulm, Germany

Tài liệu tham khảo

Beller, 2011, Computing the longest common prefix array based on the Burrows–Wheeler transform, vol. 7024, 197 Brisaboa, 2009, Directly addressable variable-length codes, vol. 5721, 122 M. Burrows, D.J. Wheeler, A block-sorting lossless data compression algorithm, Research Report 124, Digital Systems Research Center, 1994. Cox, 2011, Lightweight BWT construction for very large string collections, vol. 6661, 219 Crochemore, 1998, Automata and forbidden words, Information Processing Letters, 67, 111, 10.1016/S0020-0190(98)00104-5 Culpepper, 2010, Top-k ranked document search in general text databases, vol. 6347, 194 Devroye, 1992, A note on the height of suffix trees, SIAM Journal on Computing, 21, 48, 10.1137/0221005 Ferragina, 2010, Lightweight data indexing and compression in external memory, vol. 6034, 697 P. Ferragina, G. Manzini, Opportunistic data structures with applications, in: Proc. IEEE Symposium on Foundations of Computer Science, 2000, pp. 390–398. Flick, 2009, Sense from sequence reads: Methods for alignment and assembly, Nature Methods, 6, S6, 10.1038/nmeth.1376 Gagie, 2009, Range quantile queries: Another virtue of wavelet trees, vol. 5721, 1 Gog R. Grossi, A. Gupta, J.S. Vitter, High-order entropy-compressed text indexes, in: Proc. 14th Annual ACM–SIAM Symposium on Discrete Algorithms, 2003, pp. 841–850. Gusfield, 1997 G. Hampikian, T. Andersen, Absent sequences: Nullomers and primes, in: Proc. 12th Pacific Symposium on Biocomputing, 2007, pp. 355–366. Herold, 2008, Efficient computation of absent words in genomic sequences, BMC Bioinformatics, 9, 167, 10.1186/1471-2105-9-167 Jacobson, 1989, Space-efficient static trees and graphs, 549 Kärkkäinen, 2007, Fast BWT in small space by blockwise suffix sorting, Theoretical Computer Science, 387, 249, 10.1016/j.tcs.2007.07.018 Kärkkäinen, 2009, Permuted longest-common-prefix array, vol. 5577, 181 Kärkkäinen, 2011, Fixed block compression boosting in FM-indexes, vol. 7024, 174 Kärkkäinen, 2003, Simple linear work suffix array construction, vol. 2719, 943 Kasai, 2001, Linear-time longest-common-prefix computation in suffix arrays and its applications, vol. 2089, 181 Lam, 2009, High throughput short read alignment via bi-directional BWT, 31 Langmead, 2009, Ultrafast and memory-efficient alignment of short DNA sequences to the human genome, Genome Biology, 10, 10.1186/gb-2009-10-3-r25 Léonard, 2012, On the number of elements to reorder when updating a suffix array, Journal of Discrete Algorithms, 11, 87, 10.1016/j.jda.2011.01.002 Li, 2009, Fast and accurate short read alignment with Burrows–Wheeler transform, Bioinformatics, 25, 1754, 10.1093/bioinformatics/btp324 Li, 2009, Soap2: an improved ultrafast tool for short read alignment, Bioinformatics, 25, 1966, 10.1093/bioinformatics/btp336 Lippert, 2005, Space-efficient whole genome comparisons with Burrows–Wheeler transforms, Journal of Computational Biology, 12, 407, 10.1089/cmb.2005.12.407 Lippert, 2005, A space-efficient construction of the Burrows–Wheeler transforms for genomic data, Journal of Computational Biology, 12, 943, 10.1089/cmb.2005.12.943 Manber, 1993, Suffix arrays: A new method for on-line string searches, SIAM Journal on Computing, 22, 935, 10.1137/0222058 Manzini, 2004, Two space saving tricks for linear time LCP array computation, vol. 3111, 372 Navarro, 2007, Compressed full-text indexes, ACM Computing Surveys, 39, 10.1145/1216370.1216372 Nong, 2009, Linear suffix array construction by almost pure induced-sorting, 193 Ohlebusch, 2010, CST++, vol. 6393, 322 Ohlebusch, 2010, Computing matching statistics and maximal exact matches on compressed full-text indexes, vol. 6393, 347 Okanohara, 2009, A linear-time Burrows–Wheeler transform using induced sorting, vol. 5721, 90 Pinho, 2009, On finding minimal absent words, BMC Bioinformatics, 10, 137, 10.1186/1471-2105-10-137 Puglisi, 2007, A taxonomy of suffix array construction algorithms, ACM Computing Surveys, 39, 1, 10.1145/1242471.1242472 Puglisi, 2008, Space–time tradeoffs for longest-common-prefix array computation, vol. 5369, 124 T. Schnattinger, Bidirektionale indexbasierte Suche in Texten, Diploma thesis, University of Ulm, Germany, 2010. Simpson, 2010, Efficient construction of an assembly string graph using the FM-index, Bioinformatics, 26, i367, 10.1093/bioinformatics/btq217 Sirén, 2009, Compressed suffix arrays for massive data, vol. 5721, 63 Sirén, 2010, Sampled longest common prefix array, vol. 6129, 227 Umar, 2006, Longest-common-prefix computation in Burrows–Wheeler transformed text Wu, 2010, Efficient computation of shortest absent words in a genomic sequence, Information Processing Letters, 110, 596, 10.1016/j.ipl.2010.05.008