|
Boost : |
From: Matthew Austern (austern_at_[hidden])
Date: 2001-01-08 12:21:28
David Abrahams wrote:
> Also, isn't there a hashtable representation that never keeps more than a
> single item in each top-level array element? I think you can do something on
> insertion where you bump an item down to the next available bucket when you
> find a bucket is full, only rehashing when items get more than N steps from
> their hash bucket... even if we settle on a general-purpose hashtable
> representation, it might be a good idea to support the above.
Yes, this is called "open addressing". It's described in Knuth.
I thought a lot about using open addressing for an STL-style hashtable
implementation, and I never figured out a reasonable way to do it.
Here are the problems that I saw:
- Erasing is hard when you use open addressing. The problem is that
an element with a given hash code might be in one of several
places. If it's in slot B, you can only find it if you know that
there's something already there in slot A. So what happens if
you had something in slot A, and you've now erased it? The usual
way around this is to set a bit in slot A saying that there was
something there and now it's gone.
- It doesn't interact very well with general C++ types, where you
have to worry about invoking constructors and destructors.
- It doesn't interact very well with automatic resizing. All of
the discussions of open addressing that I've seen assume fixed
sized tables.
In principle, the advantage of open addressing is that you can get
some space saving. I don't know if that's a big enough advantage to
be worth the pain.
--Matt
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk