Boost logo

Boost :

Subject: Re: [boost] [sort] [container] Challenge for sorting experts, improved stable sort
From: Phil Endecott (spam_from_boost_dev_at_[hidden])
Date: 2015-09-23 10:55:29


Ion Gazta?aga wrote:
> On 22/09/2015 18:41, Phil Endecott wrote:
>
>> Have you considered making this general-purpose by adding a
>> feature to vector that exports an allocator, or a memory_resource,
>> that could be used by any code that needs a temporary buffer
>> while it can guarantee not to cause the vector to expand?
>> You could then modify inplace_merge, stable_sort etc. to take
>> such an allocator to use instead of get_temporary_buffer. And
>> you could choose whether or not to use a fallback, i.e.
>> get_temporary_buffer, if the spare space in the vector is
>> insufficient.
>
> Interesting. Could you please elaborate the basic ideas of this
> approach?

Well there's nothing magic, but using polymorphic memory resources
it would be something like this; you could do something similar
with Allocators:

// Add features to vector to get access to its beyond-the-end
// storage:
class vector_unused_storage: public memory_resource { ... };
vector_unused_storage vector::get_unused_storage() const;

// Add a version of inplace_merge that takes a memory resource
// to use instead of get_temporary_buffer:
template< class BidirIt >
void inplace_merge( BidirIt first, BidirIt middle, BidirIt last,
                     memory_resource& mr );

// A vector containing two ordered subranges:
vector<int> v;
v.push_back(42);
v.push_back(123);
v.push_back(99);
v.push_back(100);
// Merge the two subranges, using the vector's own unused beyond-
// the-end storage for any temporary that is required:
inplace_merge(v.begin(), v.begin()+2, v.end(),
               v.get_unused_storage());

I'll just say again - I think this is an interesting thought
experiment, but I'm not at all convinced that it's actually
useful in practice!

> I think there are some "guarantees" or "expected behaviour" when
> inserting to a flat container:
>
> -> The standard insertion function, IMHO, should guarantee that if
> capacity() - size() is bigger than the range to be inserted, vector
> won't be reallocated (after all, capacity() should mean something ;-)).

Yes, certainly.

> -> About using temporary storage (except the stack) in a method of a
> container: I haven't seen it in any STL implementation.

Probably true. But there are some std algorithms that do use
temporary storage, and I don't see any particular reason why
methods of containers should have different requirements than
non-member algorithms. Perhaps the committee had some rationale
for not providing an allocator interface to algorithms that need
temporary storage, and for the existence of get_temporary_buffer.

> Even
> std::list::sort, which maybe could be optimized using a temporary
> storage offering random-access iterators, is always implemented using
> temporary lists and merging them.

It can't have its big-O complexity improved though.

> allocating
> temporary storage from the allocator or new/malloc in every range
> insertion sounds like a very good method to fragment memory.

Shrug. As is often the case, there is a tradeoff here:
- Insert in the naive way and you get O(N^2) execution time.
- Insert in the smart way and you get O(N log N) execution time but
   maybe you suffer some memory fragmentation issue.

>>> For already sorted inputs and multikey containers, when
>>> size()+std::distance(begin, end) > capacity(),
>>
>> I think you mean < capacity(), right?
>
> No, sorry. I think I didn't explained it very well. When the current
> remaining capacity (capacity() - size()) is less than the size of range
> to be inserted (that is, the vector has to be reallocated because new
> elements don't fit in the capacity), and the input is sorted
> (ordered_range_t overloads), then a new buffer is allocated and
> std::merge() can be used to achieve a O(N) range insertion.

Ah, OK, I see. Yes that makes sense.

>>> Not yet, but if known algorithms are used, the difference between
>>> inplace_sort|merge and sort|merge is theoretically big. Merge becomes
>>> O(N) instead of O(NlogN). Sort becomes O(logN) instead of O(N(logN)^2).
>>
>> I think you're misusing the term "inplace" in that paragraph. I think
>> you mean "with no additional memory". "inplace" just means that the
>> result ends up back where the input was. All std:: sorts are inplace.
>
> Yes, you are right, let's call them "no additional memory" methods.
>
>> std::sort is never O(N(logN)^2) (or O(logN), but I presume that's a
>> typo!). std::stable_sort is allowed be, but see below.
>
> Sorry if I didn't explain it clearly: I'm always talking about a stable
> sort, I think it's the only alternative for flat containers.

Well that's worth thinking about for a moment. You know that keys are
unique in the "old" part of the container. Can that be exploited in
some useful way? Hmm, maybe not.

>> Right, stable sorts that are "almost" O(N log N) are known; they work
>> with blocks of size sqrt(N). I say "almost" because they have problems
>> when more than sqrt(N) items are equal to each other. I'm not sure if
>> that is solved. But these algorithms tend to have a constant factor of
>> maybe 5 or 7 compared to unstable sorts or sorts that use extra memory,
>> and I don't think I've ever seen one used anywhere. I guess it's because
>> of this that std::stable_sort is not required to be O(N log N).
>
> Thanks. It seems that some implementations have improved that constant
> factor. At least some benchmarks against std::stable_sort in the
> following links tell us that they might quite competitive:
>
> https://github.com/Mrrl/GrailSort
> https://github.com/BonzaiThePenguin/WikiSort
>
> According to "Fast Stable Merging and Sorting in Constant Extra Space'"
> (http://comjnl.oxfordjournals.org/content/35/6/643.full.pdf), chapter
> "5.3 Constant of Proportionality Bounds":
>
> "We conclude that we are guaranteed a worst-case key-comparison and
> record-exchange grand total not greater than 2.5N*log2N"
>
> "This worst-case total compares favourably with average-case
> key-comparison and record-exchange totals for popular unstable methods:
> quick-sort's average-case figure is a little more than 1.4Nlog2N;
> heap-sort's is about 2.3Nlog2N. (These values are derived from the
> analysis in Ref. 14, where we count a single record movement at one
> third the cost of a two-record exchange.)

Well that's great then. Someone want to implement it?!

Regards, Phil.


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk