Subject: Re: [boost] [flat_set] When to sort?
From: Andrey Semashev (andrey.semashev_at_[hidden])
Date: 2017-03-26 11:35:07
On 03/26/17 14:00, Vladimir Batov via Boost wrote:
> On 2017-03-26 21:28, Andrey Semashev via Boost wrote:
>> What you're probably seeing is the overhead that comes from inserting
>> in the middle of the vector that is used by flat_set internally. This
>> especially matters if elements have inefficient move/copy.
> Yes, I initially thought so too... but the value_type is merely a pair
> of pointers. The comparison operation is expensive. So, it appears it
> plays quite a role during initial construction. When I eliminated that
> as described and sorted once... huge boost.
Well, if the ordering predicate plays the defining role, I don't quite
see where your gain is. Inserting N elements in an empty flat_set or
inserting N elements in a vector and then sorting would give you the
same number of calls to the predicate, unless I'm missing something.
I still think that moving around lots of pointers on every insertion
(you said the end size was ~50000 elements and I'm assuming normal
distribution of insert positions) is the culprit. E.g. towards the end
of construction inserting an element in the middle would have to move
50000 * 8 = ~390 KiB of data, which is quite a lot. Did you try
profiling the code?
>> No, this won't work in general because these methods are not supposed
>> to modify the container. In particular, they don't invalidate
>> iterators or references, and calling these methods concurrently is
> I do not feel those begin() et al will invalidate anything because the
> first call to, say, begin() will sort the underlying container once.
> After that it all works as usual, right?
No, not quite.
flat_set< int > fs;
auto it1 = fs.begin();
auto it2 = fs.insert(42).first;
auto it3 = f2.begin(); // invalidates it1 and it2
Also, as I mentioned, multiple threads may me calling begin() or find()
concurrently. These methods must also have const-qualified versions to
be able to be used on non-mutable containers.
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk