Boost logo

Boost :

Subject: Re: [boost] [sort] [container] Challenge for sorting experts, improved stable sort
From: Phil Endecott (spam_from_boost_dev_at_[hidden])
Date: 2015-09-22 12:41:28


Ion Gazta?aga wrote:
> On 21/09/2015 20:53, Phil Endecott wrote:
>> Why do you (still) want to sort the whole container, rather than just
>> sorting the new items and then using inplace_merge?
>
> Well, I actually want to have both merge and sort. And yes, if
> understand this correctly sort + merge would be faster.
>
> In any case the idea is to have both (sort and merge) as adaptable
> algorithms, so that the extra capacity of the vector can be used to
> speed up the both sort and merge.

Have you considered making this general-purpose by adding a
feature to vector that exports an allocator, or a memory_resource,
that could be used by any code that needs a temporary buffer
while it can guarantee not to cause the vector to expand?

You could then modify inplace_merge, stable_sort etc. to take
such an allocator to use instead of get_temporary_buffer. And
you could choose whether or not to use a fallback, i.e.
get_temporary_buffer, if the spare space in the vector is
insufficient.

(I'd want to see numbers convincing me of the practical usefulness
of this, though; I've worked on plenty of memory-constrained
projects and I've never been desperate enough to resort to this
sort of thing!)

> For already sorted inputs and multikey
> containers, when size()+std::distance(begin, end) > capacity(),

I think you mean < capacity(), right?

> then a
> usual O(N) merge can be done in the newly allocated memory. The latest
> version of flat containers (Boost 1.59) already does this.
>
>> The idea of using the allocated-but-unused part of the buffer for
>> temporary storage is an interesting one, which would benefit
>> inplace_merge as well as stable_sort. Have you tried to quantify
>> the benefits?
>
> Not yet, but if known algorithms are used, the difference between
> inplace_sort|merge and sort|merge is theoretically big. Merge becomes
> O(N) instead of O(NlogN). Sort becomes O(logN) instead of O(N(logN)^2).

I think you're misusing the term "inplace" in that paragraph. I think
you mean "with no additional memory". "inplace" just means that the
result ends up back where the input was. All std:: sorts are inplace.

std::sort is never O(N(logN)^2) (or O(logN), but I presume that's a
typo!). std::stable_sort is allowed be, but see below.

> When mergingsorting a range bigger than the extra capacity, steps up the
> the extra capacity would be O(N) and the last merge step would be
> N*logN. There are some mergesort implementations that only require N/2
> extra memory, which really makes the extra capacity more attractive:
>
> "Fast mergesort implementation based on half-copying merge algorithm"
> http://kicia.ift.uni.wroc.pl/algorytmy/mergesortpaper.pdf.
>
> When growing the vector, the extra capacity is sometimes quite big
> (depending on the growth factor). I think speedups will be noticeable.
> Obviously, nothing is real until it's measured.
>
> There are C++ implementations of stable sort algorithms which claim they
> beat std::stable_sort, only require O(1) extra memory (although I guess
> it's really O(logN) extra memory if recursion is used), and still
> guarantee O(N*log(N)) complexity instead of O(N*(logN)^2). Based on
> these papers:
>
> http://www.math.ntua.gr/~symvonis/publications/c_1994_S_Optimal%20Stable%20Merging.pdf
>
> https://github.com/BonzaiThePenguin/WikiSort/blob/master/tamc2008.pdf
>
> http://comjnl.oxfordjournals.org/content/35/6/643.full.pdf

Right, stable sorts that are "almost" O(N log N) are known; they work
with blocks of size sqrt(N). I say "almost" because they have problems
when more than sqrt(N) items are equal to each other. I'm not sure if
that is solved. But these algorithms tend to have a constant factor of
maybe 5 or 7 compared to unstable sorts or sorts that use extra memory,
and I don't think I've ever seen one used anywhere. I guess it's because
of this that std::stable_sort is not required to be O(N log N).

> In any case those implementations are mainly tested with integers, they
> use memcpy so work only with PODs and completely ignore exception
> safety. But fixing that seems quite easier than writing a new algorithm ;-)

Yes, I don't think there should be any fundamental issues with making
any of these sqrt(N) block-based algorithms more "C++".

Cheers, Phil.


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk