Boost logo

Boost Users :

Subject: Re: [Boost-users] Query regarding interest in my submitting plf::colony to Boost
From: Niall Douglas (s_sourceforge_at_[hidden])
Date: 2015-07-13 22:03:37


On 13 Jul 2015 at 20:22, Soul Studios wrote:

> Hi all,
> I've had some expressions of support outside of the Boost community for
> submission of the plf::colony container and it's associated container
> plf::stack, mainly from the review judges for my cppcon submission on
> the same topic (topic didn't make it through to the finals but got a
> positive response).
> I'd like to guage reactions here to see whether it is worth spending the
> time necessary to port it to Boost, with all the associated reviews etc.

Firstly, you don't invest the multi-year effort to get a library into
Boost because it's superior, as it probably isn't.

You do it because of what you will become after you succeed: a Boost
library author who has passed the judgement of his or her peers.

> plf::colony is a container for unordered data, where erasure and
> addition do not cause iterator/pointer invalidation, with superior
> performance characteristics to all std:: containers when either
> (a) Pointers and iterators which point to container elements must stay
> valid regardless of container additions and deletions
> and/or
> (b) Additions to or deletions from the container will be occurring in
> performance-critical code.

I am struggling to see the gain over other combinations of STL
containers. If you need very fast deletion and never invalidated
iterators, unordered_map is very hard to beat - the cost is insertion
performance, but even that goes away with Hinnant's node_ptr proposed
extensions to the standard. Dense hash maps probably will be very
competitive, and we sorely need one of those in Boost and the STL.

I'd support a bitwise trie map (equal and near constant time
insert/find/erase), then a dense hash map (really just a vector).
Both of those I've seen a profound need for where no easy combination
of STL containers solved the problem. Your particular solution I've
not found a need for in my own coding yet. That's not to say there
isn't one, just I haven't encountered your problem space where an
application of std::vector combined with trivial customisation didn't
solve the scaling problem as needed.

> I am aware that it will be a lot of work to port colony to Boost, so I
> will only undertake this if there is genuine interest.

Don't do it because of that. Do it for yourself personally, or not at
all. Trust me: my library is up for review this Friday. I started the
process of getting it in in 2012. That's the commitment you need to
make.

> Asides from the performance characteristics, colonies have the following
> positive and negative properties:
> Positive: Lack of pointer/iterator invalidation makes interrelating
> between collections of data easier, without the cache misses associated
> with vectors of pointers to dynamically-allocated objects.
> Negative: Due to the additional overhead, performance can be worse for
> scalar types.

I'd have thought there is surely a space wastage going on here too if
I understand the algorithm. That can have cache spillage effects
under multithreaded use.

I'd benchmark your algorithm when running four benchmarks in
parallel. A lot of implementations which look faster single threaded
look very slow when multithreaded due to LL cache pressure. The STL
tries to balance those two off, and it generally succeeds,
sacrificing some single threaded performance for much improved cache
load.

FYI dense hash maps and bitwise trie maps have *excellent*
multithreaded performance, almost linear speedup with cores, thanks
to their low LL cache loading.

Niall

-- 
ned Productions Limited Consulting
http://www.nedproductions.biz/ 
http://ie.linkedin.com/in/nialldouglas/



Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net