|
Boost : |
From: Steven Ross (spreadsort_at_[hidden])
Date: 2008-07-16 10:14:44
On Sat, Jul 12, 2008 at 11:20 AM, Phil Endecott <
spam_from_boost_dev_at_[hidden]> wrote:
> The problem with that is that you might recurse a lot and fill your stack,
> so in practice you'd want to limit it to a few iterations and then give up
> and put a "wrong" value back at the source. Also you need to be certain
> that 'tmp' gets some sort of return value optimisation and isn't copied
> during each return. An explicitly unrolled iterative version would probably
> be best:
>
> Whether it actually improves performance will depend on quite how large the
> structs are. It certainly won't help with ints. So let's see your version
> that sorts structs!
Hi Phil,
I implemented this change, using a 3-way swap (n = 2; 4 copies, 2 items
placed in the right position) and sent it through my regressions, and saw a
7% speed improvement. It required also iterating over the bins, as you
originally suggested. What I also discovered was that I could get a 6%
speed improvement on my original algorithm just by replacing std::swap with
an explicitly inlined swap and changing which item gets assigned to tmp in
the swap (the order matters!), so the improvement (on integers) of your
multi-way swap on its own is about 1%. By doing the 3-way swap I should get
the majority of the speedup of your technique, as the marginal speedup
rapidly degrades as you increase n. For my testing on integers, a 4-way
swap seemed slower than a 3-way swap; my guess is that the 3-way swap is
easier to optimize.
I will upload two sample apps to www.sourceforge.net/projects/spreadsort in
the file release; one with integers, and one with integer + data.
I've attached two files:
philspreadsort.hpp
spreadsort.hpp
(plus an updated constants.h)
I'd welcome people comparing to see which is faster, and by how much, on
various platforms.
I'll note that I can add a memory optimization to philspreadsort.hpp (count
no longer has to be in the bins for speed), but haven't done so for now.
Assuming philspreadsort.hpp is at least as fast on average as
spreadsort.hpp, I'll run with philspreadsort.hpp, on the assumption that the
swapping of larger data types will be faster.
We're getting close to making integer_sort faster than std::sort for 1000
elements!
Steve
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk