Arne Maus' Sorting home page (latest update November 2015)
This page presents seven original sorting algorithms (+ variants) in Java designed by Arne Maus. All algorithms are downloadable and their usage are regulated by the BSD license (basically that their original author must always be credited whenever used). All code is accompanied by a pubished paper explaining its usage, performance and limitations. Disclaimer: Although all these algorithms are thoroughly tested, no guarantee is given that they work as intended in any application where they might be used. |
|
|
The algorithms
All code (in Java) have a small test program in 'main' so you can test the algorithms . These algorithms are, like almost all memory intensive algorithms, rather sensitive for the number of caches on the CPU and their speed relative to main memory; as demonstrated in this paper. All these algorithms operate on integer array with n non-negative numbers.
Let M be the largest element in the array, and b be the maximum number of bits in any sorting digit (tunable). Code is also in some cases given for handling a mixture of positive and negative numbers, but this handling is always done in a pre-processing, interfacing method, this is generally avoided to concentrate on the algorithms.
- NEW 2015:Full Parallel QuickSort (ParaQuick)
A full parallel version of Quicksort where there is no slow start with first sequential partition of the array a[] in two,
then a 2 parallel splitting into four parts,... ParaQuick starts with k =64 threads in full parallel Quicksorting of 1/64 part of a[] each, and then combining these parts to a valid 2-way split of a[]. Then each of these two parts can then with half of the threads be split in the same way,.., until a section only has one thread. Then a call to ordinary, sequential quicksort is made. It uses 64 threads even though your CPU might only have 4 or 8 cores. ParaQuick responses well to such 'overbooking' of threads to cores. More details in the paper.
Main new features in algorithms:Full parallel Quicksort.
1) ParaQuick, 2015,- Implemented with a few supporting arrays of constant length: #threads (=64).
On a 4 core Intel i5 CPU, ParaQuick is a 1.5-3.5 times faster algorithm than both the dual split Java library Quicksort algorithm: Array.sort() and the usual parallelization of Quicksort, when the number of elements to be sorted > 500 000. Improvements has been made to the code so it performs somewhat better than the figures in the paper.
Uses: Insertion sort and sequential Quicksort as a sub algorithms. Constant and minimal extra space. Not stable
Execution time: O(n log n).
Uses: Code: FullParaQuickTest java
Paper: A full parallel Quicksort algorithm for multicore processors
[NIK?2015, Norwegian Informatics Conf. Alesund, 2015. Tapir, www.nik.no]
Sequential ARL (Adaptive Radix Left)
Two left (most significant digit) recursive decent Radix algorithms for sorting in ascending order an integer array with n non-negative numbers.
Let M be the largest element in the array, and b be the maximum number of bits in any sorting digit (tunable)
Main new features in algorithms:In place sorting with permutational cycles & dynamically sized radix (adjusting to distribution).
2) Unstable ARL (Adaptive Radix Left), 2002,- a) Implemented with two supporting arrays of length 2**b.
The generally preferred unstable replacement for (the unstable) Quicksort.
Uses Insertsort as a sub algorithm.
Execution time: O(n log M) . Uses: O(2**b) extra space. Not stable
More than twice as fast as Quicksort (n > 1000).
Code: ARLSort java
Paper: ARL, a faster in-place, cache friendly sorting algorithm
[NIK?2002, Norwegian Informatics Conf. Kongsberg, 2002. Tapir, ISBN 82-91116-45-8. www.nik.no]
- b) Implemented with one supporting array of length 2**b.
Somewhat slower than 1a), but uses half the amount of extra space
Uses Insertsort as a sub algorithm.
Execution time: O(n log M) . Uses: O(2**b) extra space. Not stable
More than twice as fast as Quicksort (n > 1000).
Code: ARLSortN java
Paper: ARL, a faster in-place, cache friendly sorting algorithm
[NIK?2002, Norwegian Informatics Conf. Kongsberg, 2002. Tapir, ISBN 82-91116-45-8. www.nik.no]
- Implemented with a few supporting arrays of constant length: #threads (=64).
- 3) Stable ARL (Adaptive Radix Left) 2006,
- implemented with two supporting arrays of length 2**b.
The generally preferred stable replacement for (the unstable) Quicksort.
Uses Insertsort and (on rare occasions) ordinary right radix as sub algorithms.
Execution time: O(n log M) . Uses: O(2**b) extra space. Stable
Almost twice as fast as Quicksort (n > 1000).
Code: ARLSortS.java
Paper: Making a fast unstable sorting algorithm stable
[NIK?2006, Norwegian Informatics Conf, Molde, 2006. Tapir, ISBN 82-519-2186-4. www.nik.no]
- implemented with two supporting arrays of length 2**b.
- Parallel ARL, PARL (Parallel Adaptive Radix Left)
A full parallel version of algorithm 1a) above for a multicore CPU with k cores and one shared main memory, A pool of k threads is started by generating a sorting object (ParaSort p = new ParaSort();) To sort an array a, the user only calls the soring method within the sorting object p ( p.pARL(a);). The users does not create, start or stop threads - they see this as plain sequental code.
Main new features in algorithm: A full parallel algorithm, meaning that all the barrier synchronized sorting steps are of length < n/k, and there is only one sequential statement in the algorithm (Amdahl's law).
4) PARL ( Parallel Adaptive Radix Left), 2011,- Implemented with k threads, and basically a set of supporting arrays with a total length = n + 2**b.
The fastest replacement for (the unstable) Quicksort. Very effective even on dual core CPUs
Uses Insertsort and sequential ARL as a sub algorithms.
Execution time: O(n log M/k) . Uses: O(n+ 2**b) extra space. Not stable
Faster then Quicksort and from 4 and up to 30 times as fast as Quicksort (n > 150000), depending on the number of cores.
Code: PARLSort java
Paper: A full parallel radix sorting algorithm for multicore processors
[NIK?2011, Norwegian Informatics Conf. Troms�, 2011. Tapir, ISBN 978-82-519-2843-4. www.nik.no]
- Implemented with k threads, and basically a set of supporting arrays with a total length = n + 2**b.
- Buffered Adaptive Radix
Two Radix algorithms for sorting in ascending order an integer array with n non-negative numbers with the help of a potentially smaller buffer. One algorithm is the iterative right (least significant digit) radix algorithm and the other is the left recursive (most significant digit first ) radix sorting algorithm Both algorithms moves data back and forth between the array and a buffer of hopefully the same length as the array.to be sorted. However, if only a smaller buffer is available, these two algorithms avoids program abortion by the system ('no more heap space') by using a the largest buffer available and demonstrate how to modify the two Radix algorithms to use more passes to copy data back and forth between buffer and array - thus saving space by using somewhat more time.
Main new features in algorithms:Radix sorting with the help of a non-zero buffer of any size <= n. They also adapt the radix size to the distribution of numbers to be sorted (smaller radixes for a sparse distributions).
5) Buffered Adaptive Radix Right 2007- Iterative,, right to left Radix sorting.
Execution time: O(n log M) . Uses: O(n) extra space. Stable
Approx. twice as Quicksort (n > 250)when buffer size is 100% of array length (n), faster when > 20%
The interface method is long (and stable) and handles also negative numbers.
Code: BARsort.java
Paper: Buffered Adaptive Radix : a fast, stable sorting algorithm that trades speed for space at runtime when needed.
[NIK?2007, Norwegian Informatics Conf. HiO, Oslo, Tapir, ISBN 978-82-519-2277-2. www.nik.no]
- Iterative,, right to left Radix sorting.
- 6) Buffered Adaptive Radix Left 2007
- Recursive decent, left to right Radix sorting.
Implemented with two supporting arrays of length 2**b.
A possible stable replacement for (the unstable) Quicksort.
Execution time: O(n log M) . Uses: up to O(n) extra space, but survives on any extra space > 0. Stable
Approx. twice as Quicksort (n > 250)when buffer size is 100% of array length (n), faster when > 20%
The interface method is long (and stable) and handles also negative numbers.
Code: BLRadix.java
Paper: Buffered Adaptive Radix : a fast, stable sorting algorithm that trades speed for space at runtime when needed.
[NIK?2007, Norwegian Informatics Conf. HiO, Oslo Tapir, ISBN 978-82-519-2277-2. www.nik.no]
- Recursive decent, left to right Radix sorting.
- Full Parallel Right Radix 2014
Two Radix algorithms for sorting in ascending order an integer array with n non-negative numbers with the help of a same sized buffer. This is 5-10 times as fast as Arrays.sort on randomly filled arrays.
7) Full Parallel Radix Right - 2014- Iterative, right to left Radix sorting.
Execution time: O(n) . Uses: O(n) extra space. Stable
Approx. 5-10 times fast as Arrays.sort (n > 200 000).
Code: ParaRadix.java
You can also download a version with extensive testing possibilities : TestandCompareParaRadix.java
Paper: Arne Maus and Stein Gjessing:"Practical Parallel Programming ? a plan for a B.S. course on how to design efficient parallel algorithms.", EduPar14, Poenix, AZ, 2014
- Iterative, right to left Radix sorting.
Paper, the effects of caching on sorting
- Arne Maus and Stein Gjessing:
- A Model for the Effect of Caching on Algorithmic Efficiency in Radix based Sorting, The Second International Conference on Software Engineering Advances, ICSEA 25.Aug. 2007, France
Latest update 19. November 2015 (soon to be published: Effective sequential and parallel ShellSort).