Boost logo

Boost :

Subject: Re: [boost] [flat_set] When to sort?
From: Vladimir Batov (Vladimir.Batov_at_[hidden])
Date: 2017-03-26 21:01:15


On 03/26/2017 10:35 PM, Andrey Semashev via Boost wrote:
> On 03/26/17 14:00, Vladimir Batov via Boost wrote:
>> On 2017-03-26 21:28, Andrey Semashev via Boost wrote:
>>
>>> What you're probably seeing is the overhead that comes from inserting
>>> in the middle of the vector that is used by flat_set internally. This
>>> especially matters if elements have inefficient move/copy.
>>
>> Yes, I initially thought so too... but the value_type is merely a pair
>> of pointers. The comparison operation is expensive. So, it appears it
>> plays quite a role during initial construction. When I eliminated that
>> as described and sorted once... huge boost.
>
> Well, if the ordering predicate plays the defining role, I don't quite
> see where your gain is. Inserting N elements in an empty flat_set or
> inserting N elements in a vector and then sorting would give you the
> same number of calls to the predicate, unless I'm missing something.

I must have failed to describe my use-case as it is not "inserting N
elements in an empty flat_set". In fact, it is rather inserting quite a
number (about 50,000 to 200,000 so far) of value_type instances on
one-by-one basis.

> I still think that moving around lots of pointers on every insertion
> (you said the end size was ~50000 elements and I'm assuming normal
> distribution of insert positions) is the culprit. E.g. towards the end
> of construction inserting an element in the middle would have to move
> 50000 * 8 = ~390 KiB of data, which is quite a lot.

It might well be. I did not profile what exactly contributes so much
during flat_set construction. I simply chopped the whole construction
off (replaced with straight emplace_back) to see huge 30-50% improvement.

> Did you try profiling the code?

:-) You are just teasing me, right? :-)

>
>>> No, this won't work in general because these methods are not supposed
>>> to modify the container. In particular, they don't invalidate
>>> iterators or references, and calling these methods concurrently is
>>> thread-safe.
>>
>> I do not feel those begin() et al will invalidate anything because the
>> first call to, say, begin() will sort the underlying container once.
>> After that it all works as usual, right?
>
> No, not quite.
>
> flat_set< int > fs;
> fs.reserve(5);
> auto it1 = fs.begin();
> auto it2 = fs.insert(42).first;
> auto it3 = f2.begin(); // invalidates it1 and it2

Well, I do understand that insertion into a std::vector invalidates the
iterators. :-) I thought you argued (when you said "No, this won't work
in general because...") that "my" approach (of only sorting on
when-needed basis) had different to the flat_set behavior. I argued that
it was the same. That is, no changes in behavior but a huge construction
performance boost. That fact should not probably be simply ignored.

> Also, as I mentioned, multiple threads may me calling begin() or
> find() concurrently. These methods must also have const-qualified
> versions to be able to be used on non-mutable containers.

I can't see any of it as a fundamental problem. They all addressed in
implementation:

template<typename T, typename Sort>
struct aux::sorted_vector
{ ...
     iterator begin () { sort_(); return all_.begin(); }
     iterator end () { sort_(); return all_.end(); }
     const_iterator begin () const { sort_(); return all_.begin(); }
     const_iterator end () const { sort_(); return all_.end(); }
     ...

the sort_() takes care of sorting (if/when needed) and MT safety.


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk