Boost logo

Boost :

Subject: Re: [boost] [sort] pdqsort
From: Marc Glisse (marc.glisse_at_[hidden])
Date: 2017-01-13 16:05:38


On Fri, 13 Jan 2017, Steven Ross wrote:

> I've been reviewing a proposed addition to Boost.Sort called pdqsort. When
> using a general-case partitioning algorithm, it performs comparably to
> std::sort in my testing (Windows and Linux). When used as a replacement
> for the std::sort fallback in spreadsort, using a branch-reducing
> optimization it is ~20% faster on ints and floats, and ~40% slower on
> strings. The concerning part is that any comparison operator that involves
> a branch may have a comparable slowdown to the string case.
> pdqsort is also significantly faster than std::sort on mostly-sorted data
> and some other (relatively common) special cases. This makes little
> difference when used as a fallback by spreadsort, but does make a
> difference when used on its own.
>
> I see these possible things to do with pdqsort:
> 1) Reject it completely as too similar to std::sort, which is already
> highly optimized.
> 2) Add it as another optional library in the Boost.Sort library.
> 3) Add it to Boost.Sort, and only use it as a fallback for ints and floats.
> 4) Add it to Boost.Sort, use it as a fallback for all of spreadsort, and
> have pdqsort itself only use its branch-reduction optimization on ints and
> floats.
> 5) #4, except eliminate the branch-reduction optimization completely from
> pdqsort for simplicity.
>
> Does anyone have a strong preference, advice, or comments on how they use
> Boost.Sort?
>
> Should I run a mini-review for pdqsort?

I assume this is the same pdqsort (https://github.com/orlp/pdqsort) that
was proposed for both libc++ and libstdc++ as an implementation of
std::sort. For libstdc++, sadly, it was never reviewed because the
reviewer turned out to be busier than he expected. I didn't follow what
happened for libc++.

>From my notes (https://gcc.gnu.org/ml/libstdc++/2015-10/msg00040.html), on
my application where copy is cheap (one pointer) but the comparison can be
a bit expensive (reading memory through several pointers), libstdc++ sort
was taking 3.46s, stable_sort 3.08s, and pdqsort 2.85s.

I am surprised you found it so much slower for strings. Was it C++03 or
C++11? COW strings or not? Short or long strings? Did the author have any
comment on this benchmark? Now the code may have changed since I tried it
in 2015, maybe it was re-tuned more for int/float (the branch-reducing
stuff?) and less for more expensive types since then...

(A naive question: do people ever sort a large, plain vector<int> or
vector<double>? I don't think I've ever used sort with basic types, or at
least not on large vectors where performance was critical. The closest
would have the int/double as one element of a pair.)

I'd rather see the various implementations of std::sort improved to the
point where integrating pdqsort in boost would be useless, but I am not
sure that's going to happen.

I am not sure what your benchmark involves exactly. If pdqsort is used
only as a fallback for spreadsort, that's a special kind of input that
might not represent what users would feed it if they called it directly.
However, a 40% slowdown is a serious cause for concern... You don't say if
disabling the branch-reduction option improves performance for strings,
which your #4 hints at.

-- 
Marc Glisse

Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk