Boost logo

Boost :

Subject: Re: [boost] [flat_set] When to sort?
From: Andrey Semashev (andrey.semashev_at_[hidden])
Date: 2017-03-26 22:20:37


On Mon, Mar 27, 2017 at 12:01 AM, Vladimir Batov via Boost
<boost_at_[hidden]> wrote:
> On 03/26/2017 10:35 PM, Andrey Semashev via Boost wrote:
>>
>> Well, if the ordering predicate plays the defining role, I don't quite see
>> where your gain is. Inserting N elements in an empty flat_set or inserting N
>> elements in a vector and then sorting would give you the same number of
>> calls to the predicate, unless I'm missing something.
>
> I must have failed to describe my use-case as it is not "inserting N
> elements in an empty flat_set". In fact, it is rather inserting quite a
> number (about 50,000 to 200,000 so far) of value_type instances on
> one-by-one basis.

I don't see how that changes anything.

>>>> No, this won't work in general because these methods are not supposed
>>>> to modify the container. In particular, they don't invalidate
>>>> iterators or references, and calling these methods concurrently is
>>>> thread-safe.
>>>
>>> I do not feel those begin() et al will invalidate anything because the
>>> first call to, say, begin() will sort the underlying container once.
>>> After that it all works as usual, right?
>>
>> No, not quite.
>>
>> flat_set< int > fs;
>> fs.reserve(5);

I've made a mistake here in my example, the following line is missing:

  fs.insert(10);

>> auto it1 = fs.begin();
>> auto it2 = fs.insert(42).first;
>> auto it3 = f2.begin(); // invalidates it1 and it2
>
> Well, I do understand that insertion into a std::vector invalidates the
> iterators. :-) I thought you argued (when you said "No, this won't work in
> general because...") that "my" approach (of only sorting on when-needed
> basis) had different to the flat_set behavior.

I did, and the example above shows where your proposed change breaks
the previously valid code.

> I argued that it was the
> same. That is, no changes in behavior but a huge construction performance
> boost. That fact should not probably be simply ignored.

I'm not opposed to the idea of optimizing your use case. I just don't
think flat_set is the right tool for it.

The concept of self-organizing containers (including a
staging/scratchbook area) is not new, there is a precedent in
Boost.Intrusive at least. But those tricks typically don't work well
with certain common properties of traditional containers, such as the
ones I mentioned. For this reason it is best to provide these new
features under new names. IOW, propose a new container instead of
subverting behavior of an existing one.

>> Also, as I mentioned, multiple threads may me calling begin() or find()
>> concurrently. These methods must also have const-qualified versions to be
>> able to be used on non-mutable containers.
>
> I can't see any of it as a fundamental problem. They all addressed in
> implementation:
>
> template<typename T, typename Sort>
> struct aux::sorted_vector
> { ...
> iterator begin () { sort_(); return all_.begin(); }
> iterator end () { sort_(); return all_.end(); }
> const_iterator begin () const { sort_(); return all_.begin(); }
> const_iterator end () const { sort_(); return all_.end(); }
> ...
>
> the sort_() takes care of sorting (if/when needed) and MT safety.

How does sort_() handle thread safety? Are you proposing to add a
mutex to the container?


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk