Subject: Re: [boost] [multi_index][ann] Boost.MultiIndex 1.56 preview: major update with new hashed index implementation
From: Ion Gaztañaga (igaztanaga_at_[hidden])
Date: 2013-12-25 16:19:20
El 25/12/2013 18:12, Joaquín Mª López Muñoz escribió:
> The most important change is a complete reimplementation of the internals
> of hashed indices, so that erasure of elements is constant-time regardless
> of the level of occupancy of the internal bucket array (previously,
> performance degraded as the bucket was being depleted.) A full description
> of the new data structure as compared with other popular implementations of
> unordered associative containers is given at:
Joaquín, you blog posts are incredibly interesting. Congratulations.
Just in case you want to check it in the future, Boost.Intrusive uses a
policy-based data structure, but suffers from the same performance
degradation as MultiIndex had in pre-1.56 versions. AFAIK, the good old
bucket rooted singly linked list is the only one that can support
"auto-unlink" hooks (nodes that delete themselves from the container
without having a pointer to the container).
- It does avoid the "iterator erase(...)" problem returning "void"
instead of "iterator" (as Intrusive need not to be standard-conforming).
- If the user is willing to pay the price, it can offer "group" linking
as Boost.Unordered does for duplicated elements but although this does
not solve the "depleted bucket" syndrome. Your proxy-returning erase()
- It can also store the hash value in the node and this is used for
faster rehashing and lookups but it does not implement the
libstdc++/libc++/Boost.Unordered data structure. This was planned for
the future to improve erasure times.
- Power of two buckets can be optionally enabled to speed up some
operations if the user is sure the hash function is good enough.
You novel data structure is really interesting, although I guess it's
not easy to implement, specially when duplicates are around.
It's interesting to see how well the Dinkumware implementation performs,
as the memory cost is now lower than some years ago, now that the rest
of implementations have added the hash value in the node (the overhead
of the Dinkumware implementation is one pointer per bucket).
Congratulations again, I'm sure you'll be able to tune your
implementation and improve the performance in the future.
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk