Boost logo

Boost :

Subject: Re: [boost] [Container] interest in a flat_unordered_map
From: Andrey Semashev (andrey.semashev_at_[hidden])
Date: 2015-09-18 20:31:12


On Fri, Sep 18, 2015 at 11:09 PM, Treb Connell <trebconnell_at_[hidden]> wrote:
> I'm hoping to port a flat hash map that I've written to boost.
>
> *Performance (primary reason for use):*
> It uses memory and cache very effectively and ends up being *very fast*.
> It's hard to benchmark real world situations, but in inserting, finding,
> erasing, and iterating over a dictionary it was up to six orders of
> magnitude faster than VS2013's unordered_map. This isn't a totally fair
> test because the cache efficiency really shines.

I believe, much of the benefit depends on the nature of keys and
values (e.g. ints vs. strings)? Is there a more detailed description
of the tests and results, at different container sizes?

> The most optimal implementation of iterators for this container *does not
> have constant time traversal*. The implementation is a vector with
> elements scattered across it. To increment an iterator I continue to
> increment the internal vector iterator until I hit a new element. In
> practice elements should be scattered evenly, and will be close to one
> another, but in the worst case this is N-1. Additional memory or logic
> could be added to make this O(1), but I don't think that trade is worth
> it. In my applications of unordered_map I rarely iterate, and would not
> want to penalize insert, find, or erase to make iteration faster.

Can the iterator tradeoff be made a compile time option?

One frequently used operation that generally amounts to iterating
through elements is clear(). Do I understand correctly that its
complexity can be O(N) where N>size()? I believe the same applies to
range erase().

Also, is size() constant time?

> I could make a new concept for this, pay the cost for making it truly O(1),
> or *since the common case is not O(N) would it be okay to consider these
> bidirectional iterators*?

I think the iterators should have bidirectional traversal category
regardless of the choice on the complexity. It's the closest standard
category that fits anyway.


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk