Engineering Parallel String Sorting

Springer Science and Business Media LLC - Volume 77 - Pages 235-286 - 2015
Timo Bingmann¹, Andreas Eberle¹, Peter Sanders¹
¹ Karlsruhe Institute of Technology, Karlsruhe, Germany

Abstract

We discuss how string sorting algorithms can be parallelized on modern multi-core shared memory machines. As a synthesis of the best sequential string sorting algorithms and successful parallel sorting algorithms for atomic objects, we first propose string sample sort. The algorithm makes effective use of the memory hierarchy, uses additional word level parallelism, and largely avoids branch mispredictions. Then we focus on NUMA architectures, and develop parallel multiway LCP-merge and -mergesort to reduce the number of random memory accesses to remote nodes. Additionally, we parallelize variants of multikey quicksort and radix sort that are also useful in certain situations. As base-case sorter for LCP-aware string sorting we describe sequential LCP-insertion sort which calculates the LCP array and accelerates its insertions using it. Comprehensive experiments on five current multi-core platforms are then reported and discussed. The experiments show that our parallel string sorting implementations scale very well on real-world inputs and modern machines.
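To make the notion of LCP-aware sorting concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes the common convention that the LCP array entry at index i holds the length of the longest common prefix of the strings at positions i-1 and i in sorted order, with entry 0 set to 0. The function names `lcp`, `lcp_array`, and `lcp_compare` are hypothetical and chosen for this example only. The sketch shows how a comparison can skip characters already known to match, which is the saving that LCP-aware algorithms exploit.

```python
from typing import List, Tuple


def lcp(a: str, b: str, start: int = 0) -> int:
    """Length of the longest common prefix of a and b.

    Assumes the caller already knows that the first `start`
    characters of a and b are equal, so scanning begins there.
    """
    i = start
    n = min(len(a), len(b))
    while i < n and a[i] == b[i]:
        i += 1
    return i


def lcp_array(sorted_strings: List[str]) -> List[int]:
    """LCP array of an already sorted list (entry 0 is 0 by convention)."""
    if not sorted_strings:
        return []
    return [0] + [
        lcp(sorted_strings[i - 1], sorted_strings[i])
        for i in range(1, len(sorted_strings))
    ]


def lcp_compare(a: str, b: str, known: int = 0) -> Tuple[int, int]:
    """Compare a and b, skipping `known` characters that already match.

    Returns (sign, h): sign is -1, 0, or 1 for a < b, a == b, a > b,
    and h is the full common prefix length of a and b.
    """
    h = lcp(a, b, known)
    if h == len(a) and h == len(b):
        return 0, h
    if h == len(a):
        return -1, h  # a is a proper prefix of b
    if h == len(b):
        return 1, h   # b is a proper prefix of a
    return (-1 if a[h] < b[h] else 1), h


words = sorted(["banana", "band", "bandana", "ban", "apple"])
print(words)
print(lcp_array(words))
```

In this sketch, `lcp_compare("banana", "band", 3)` starts scanning at index 3 because the shared prefix "ban" is already known, so only one character pair is inspected before the order is decided.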
