Boost logo

Boost :

From: David Abrahams (abrahams_at_[hidden])
Date: 2001-01-06 16:03:28


----- Original Message -----
From: "Daryle Walker" <darylew_at_[hidden]>

> Since it seems that hash stuff is useful, maybe we could add hash-based
> classes to Boost.

Funny you should mention that. Beman has been looking into that for some
time, and I've recently had a idea:

There are lots of different ways to formulate a hashtable, but the
fundamental idea is that there's a randomly-accessible container of hash
buckets, each of which can contain an arbitrary number of items. You could
choose vector or deque (or something else?) for the array of buckets. Each
bucket could be represented by any number of different containers, with
different trade-offs. Sound familiar? To me it sounds a lot like the BGL
adjacency list, and a ripe area for the use of selectors.
Overgeneralization? Quite possibly; I don't know.

Also, isn't there a hashtable representation that never keeps more than a
single item in each top-level array element? I think you can do something on
insertion where you bump an item down to the next available bucket when you
find a bucket is full, only rehashing when items get more than N steps from
their hash bucket... even if we settle on a general-purpose hashtable
representation, it might be a good idea to support the above.

> What about red-black trees? I don't know what they are, but I've heard
that
> they're used in making hashes, sets, etc.

Not hashes, usually, but sets, maps, etc. Your std::set is almost certainly
implemented using one.

> so maybe we can make one of those
> too. (Where can I find out about them?)

A google search reveals this intro, which seems about right:
http://swww.ee.uwa.edu.au/~plsd210/ds/red_black.html

Funny you should mention rb-trees. My recent work on binary searches has led
me to think about which rb-tree representation we'd want. I've begun to
think that the approach used in most standard library implementations is
probably not the right one to expose to users; the comparison function
shouldn't be part of the definition. What we really need is a balanced tree
container with upper_bound, lower_bound, find, and equal_range which accept
comparison function parameters. I think we also want some hooks into the
rebalancing algorithms so that one can implement an interval tree (an
example at
http://www.csua.berkeley.edu/~emin/source_code/cpp_trees/index.html, which I
can't vouch for)

-Dave


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk