Boost logo

Boost :

Subject: Re: [boost] radix sort
From: Jeremy Murphy (jeremy.william.murphy_at_[hidden])
Date: 2013-07-25 21:43:38


On 25 July 2013 20:15, Steven Ross <spreadsort_at_[hidden]> wrote:

> On Tue, Jul 23, 2013 at 11:19 AM, Jeremy Murphy <
> jeremy.william.murphy_at_[hidden]> wrote:
>
>
> Yes, radix sorts work great when the data type is small (such as 8-bit or
> 16-bit integers). To get a better comparison against std::sort for the
> larger integers, I suggest you use data in a range roughly matching the
> size of the data type (what is RAND_MAX on your system? It's usually less
> than 1 << 16).
>

Actually, RAND_MAX = INT_MAX = (1 << 31) - 1 on my system (glibc-2.15). So
how do I generate random 64-bit numbers... (rand() << 32) + rand()? Bah, I
should just be using Boost.Random!

As a note, MSD radix sort is extremely fast on the evenly-distributed
> random data that it looks like you're using. I test Spreadsort with a
> bunch of uneven distributions that make the problem harder, in addition to
> a few tests with evenly distributed random data. LSD radix sort doesn't
> care much about distribution, but most other algorithms (including
> std::sort) do.

Yes, I understand that's the assumption that bucket sort works on. So I
guess that's also an argument for having multiple, different
implementations. I'm keen to do a comprehensive comparative performance
test, all the algorithms across different distributions, but it will take
some time.

I've also been thinking that the LSD radix-sort can be optimized
(especially for large-width integers) where k <= (k_max >> 1). That is,
where some of the most-significant digits are not being used, the number of
column/digit sort passes with counting-sort can potentially be reduced.
 Anyway, maybe in version 2.0. :)

Jeremy


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk