Boost logo

Boost :

Subject: Re: [boost] [flat_set] When to sort?
From: Andrey Semashev (andrey.semashev_at_[hidden])
Date: 2017-03-26 10:28:49


On 03/26/17 06:02, Vladimir Batov via Boost wrote:
> I've been using boost::container::flat_set with quite a success until
> recently when I profiled my app to see that 30-50% of the run-time was
> spent in insert() for a container of around 50,000 in size.
>
> So, I had to replace boost::container::flat_set with a vector variation
> which inserts as a vector (no sorting). However, when it is the time to
> call begin(), end(), find(), it sorts once. That shaved the mentioned
> overhead off. Say, one run is down from 17s to 10s and the likes.

AFAIU, flat_set should not sort on insert. It should do lower_bound to
find the insert position, which is O(log(N)). Sorting instead would give
O(N*log(N)).

What you're probably seeing is the overhead that comes from inserting in
the middle of the vector that is used by flat_set internally. This
especially matters if elements have inefficient move/copy. Flat
containers are not well suited for frequent modifications, so that is
kind of expected.

> I wonder if that should be the boost::container::flat_set default
> behavior rather than sorting the underlying container with every
> insert(). After all, as a user I do not care if the underlying container
> is sorted... until I start using/traversing it, right?

No, this won't work in general because these methods are not supposed to
modify the container. In particular, they don't invalidate iterators or
references, and calling these methods concurrently is thread-safe.

> Or maybe I am
> missing something in boost::container::flat_set which already allows me
> to do something of that sort?

You can try using the methods and constructors that accept the
ordered_unique_range tag. If you can prepare a sequence of ordered and
unique elements, you can call insert(ordered_unique_range, s.begin(),
s.end()) or similarly construct the flat_set to avoid ordering the
sorted elements.


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk