Gerbil: a fast and memory-efficient k-mer counter with GPU-support

Marius Erbert1, Steffen Rechner1, Matthias Müller-Hannemann1
1Institute of Computer Science, Martin Luther University Halle-Wittenberg, Von-Seckendorff-Platz 1, 06120, Halle (Saale), Germany

Abstract

Keywords


References

Xavier BB, Sabirova J, Pieter M, Hernalsteens J-P, de Greve H, Goossens H, Malhotra-Kumar S. Employing whole genome mapping for optimal de novo assembly of bacterial genomes. BMC Res Notes. 2014;7(1):1–4. doi: 10.1186/1756-0500-7-484.

Chikhi R, Medvedev P. Informed and automated k-mer size selection for genome assembly. Bioinformatics. 2014;30(1):31–7. doi: 10.1093/bioinformatics/btt310.

Sameith K, Roscito JG, Hiller M. Iterative error correction of long sequencing reads maximizes accuracy and improves contig assembly. Brief Bioinform. 2016;18:1–8. doi: 10.1093/bib/bbw003.

Erbert M, Rechner S, Müller-Hannemann M. Gerbil: a fast and memory-efficient k-mer counter with GPU-support. In: International Workshop on Algorithms in Bioinformatics. Berlin: Springer; 2016. p. 150–161.

Marçais G, Kingsford C. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics. 2011;27(6):764–70.

Melsted P, Pritchard JK. Efficient counting of k-mers in DNA sequences using a Bloom filter. BMC Bioinform. 2011;12(1):1–7. doi: 10.1186/1471-2105-12-333.

Rizk G, Lavenier D, Chikhi R. DSK: k-mer counting with very low memory usage. Bioinformatics. 2013;29(5):652–3.

Deorowicz S, Debudaj-Grabysz A, Grabowski S. Disk-based k-mer counting on a PC. BMC Bioinform. 2013;14(1):1–12. doi: 10.1186/1471-2105-14-160.

Roy RS, Bhattacharya D, Schliep A. Turtle: identifying frequent k-mers with cache-efficient algorithms. Bioinformatics. 2014;30(14):1950–7. doi: 10.1093/bioinformatics/btu132.

Li Y, et al. MSPKmerCounter: a fast and memory efficient approach for k-mer counting. arXiv preprint arXiv:1505.06550; 2015.

Deorowicz S, Kokot M, Grabowski S, Debudaj-Grabysz A. KMC 2: fast and resource-frugal k-mer counting. Bioinformatics. 2015;31(10):1569–76. doi: 10.1093/bioinformatics/btv022.

Pérez N, Gutierrez M, Vera N. Computational performance assessment of k-mer counting algorithms. J Comput Biol. 2016;23(4):248–55.

Mamun AA, Pal S, Rajasekaran S. Kcmbt: a k-mer counter based on multiple burst trees. Bioinformatics. 2015;345:2783–90.

Suzuki S, Ishida T, Akiyama Y, Kakuta M. Accelerating identification of frequent k-mers in DNA sequences with GPU. In: GTC; 2014.

Roberts M, Hunt BR, Yorke JA, Bolanos RA, Delcher AL. A preprocessor for shotgun assembly of large genomes. J Comput Biol. 2004;11(4):734–52.

Roberts M, Hayes W, Hunt BR, Mount SM, Yorke JA. Reducing storage requirements for biological sequence comparison. Bioinformatics. 2004;20(18):3363–9. doi: 10.1093/bioinformatics/bth408.

Kim KE, Peluso P, Babayan P, Yeadon PJ, Yu C, Fisher WW, Chin CS, Rapicavoli NA, Rank DR, Li J, et al. Long-read, whole-genome shotgun sequence data for five model organisms. Sci Data. 2014;1:140045.