Boost logo

Boost :

Subject: Re: [boost] [flat_set] When to sort?
From: Vladimir Batov (Vladimir.Batov_at_[hidden])
Date: 2017-03-26 21:49:41


On 03/27/2017 06:03 AM, Oswin Krause via Boost wrote:
>
>> There is another variation of this approach, one I'm sure we've
>> discussed before: the flat_set keeps the unsorted range at the end and
>> on lookup tries both, first the unsorted part, then the sorted one.
>> The unsorted part is merged into place when it exceeds some size. This
>> doesn't do non-const operations on logically const lookups, but
>> iteration becomes problematic if iterators have to give you a sorted
>> view. (Although you could keep the suffix part sorted as well and
>> merge on demand in iter::op++ if so inclined.)
>
> This would have consequences both in asymptotic run time (lookups are
> O(log(N)+K) where K is the maximum allowed size of the unsorted
> region) as well as constants (more complex code, more ifs etc). If K
> is independent of N, we would still have overall quadratic time of
> inserting a lot of elements using insert(single_element). So this does
> not help.
>
> What I meant with my previous comment: it is probably the easiest way
> just to buffer all elements to be inserted in an intermediate storage
> area and then insert them at once.

I am sorry but I have to disagree on this one. I can't possibly see it
as the "easiest way". My use-case (which I do not believe is exotic by
any stretch of imagination) goes around and cherry-picks the elements
essentially one-by-one. So, having another container to populate feels
like a huge hassle. I thought the whole idea of flat_set was to get the
benefits without the hard work. :-) If I have to have such a container
to begin with, I might not need flat_set at all as I'd be then
transferring straight into std::vector, right?

> Of course we might not want to use the current insert for this as this
> would require storing all new elements. Here is my approach to this:
>
> flat_set maintains an array and three iterators(set_begin, set_end and
> vector_end). the range [set_begin,set_end) contains all inserted
> elements and size() begin(), end() will only return this range. the
> range [set_end, vector_end) contains all newly inserted elements,
> however they´are still "staging" and will not show up unless the
> flat_set is told to.
>
> so we would add two new methods
>
> staging_insert(single_element)//pushes element to the back of the
> staging area
> sort_staging()//calls sort() on the old range. afterwards
> set_end=staging_end.

That feels quite complex... what is wrong with my (seemingly) simple
suggestion of sorting on when-needed basis?


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk