
Boost : 
Subject: Re: [boost] [sort] Re: [review] Formal review period for Sort library begins today, November 10, and ends Wednesday, November 1
From: Steven Ross (spreadsort_at_[hidden])
Date: 20141123 06:46:40
On Fri Nov 14 2014 at 10:43:48 PM Steven Ross <spreadsort_at_[hidden]> wrote:
> On Thu Nov 13 2014 at 2:07:53 AM Vladimir Prus <vladimir_at_[hidden]>
> wrote:
>
what input data distribution is this? Could you run the tests with
>> worstcase data distribution
>> for your algorithm?
>>
>
> The input data distribution was evenly distributed 32bit random data.
> The absolute worstcase distribution I've been able to design for
> spreadsort is binaryalrbreaker with an ALR_THRESHOLD of 3.
> With 32bit integers, it has 512 elements, and takes 15 microseconds for
> integer_sort, and 14 microseconds for std::sort.
> With 64bit integers, it has 131072 elements, and takes 2.8ms for
> integer_sort, and 1.6ms for std::sort.
> If you add or subtract any elements, it won't be the worstcase
> distribution! If I simply double the number of elements, 64bit
> integer_sort takes 5.35ms, and std::sort takes 4.56ms. I'm not sure what
> kind of graph would make sense with this type of situation. I'd also like
> to note that this distribution is specifically designed to make
> integer_sort slow, and cannot be scaled up without improving the relative
> speed of integer_sort.
>
In analyzing this data to see why it was slower with integer_sort than
std::sort, I realized that it was creating the input data in reversesorted
order. This ends up being a case where std::sort is substantially faster
than usual (as branch prediction becomes very accurate), making it not a
worstcase comparison. I changed the code to swap 10% of the inputs with
random other inputs to break up the reversesorted order, and both
integer_sort and std::sort slowed down to comparable speeds. I also
regenerated the data (the least significant bits are randomized) and reran
the sort 10 times in a loop (on new data each time) to reduce noise. The
net result is:
std::sort elapsed time 0.049344
integer_sort elapsed time 0.046429
(results in seconds)
integer_sort is 6% faster on 64bit integers, on my Ubuntu system, for
integer_sort's worstcase distribution.
The code is checked into the develop branch
in example/binaryalrbreaker.cpp. From the <BOOST_ROOT>/libs/sort
directory, you can run b2 tune binaryalrbreaker and then run the
generated binaryalrbreaker executable.
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk