Boost logo

Boost :

From: Matthew Austern (austern_at_[hidden])
Date: 2001-01-08 12:21:28


David Abrahams wrote:

> Also, isn't there a hashtable representation that never keeps more than a
> single item in each top-level array element? I think you can do something on
> insertion where you bump an item down to the next available bucket when you
> find a bucket is full, only rehashing when items get more than N steps from
> their hash bucket... even if we settle on a general-purpose hashtable
> representation, it might be a good idea to support the above.

Yes, this is called "open addressing". It's described in Knuth.

I thought a lot about using open addressing for an STL-style hashtable
implementation, and I never figured out a reasonable way to do it.
Here are the problems that I saw:

 - Erasing is hard when you use open addressing. The problem is that
   an element with a given hash code might be in one of several
   places. If it's in slot B, you can only find it if you know that
   there's something already there in slot A. So what happens if
   you had something in slot A, and you've now erased it? The usual
   way around this is to set a bit in slot A saying that there was
   something there and now it's gone.
 - It doesn't interact very well with general C++ types, where you
   have to worry about invoking constructors and destructors.
 - It doesn't interact very well with automatic resizing. All of
   the discussions of open addressing that I've seen assume fixed
   sized tables.

In principle, the advantage of open addressing is that you can get
some space saving. I don't know if that's a big enough advantage to
be worth the pain.

                        --Matt


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk