Boost logo

Boost :

From: Amit (contact.lipik_at_[hidden])
Date: 2008-01-28 10:50:12


Daniel James <daniel_james <at> fmail.co.uk> writes:
> I should add that they were accepted too late to be included in 1.35.
> So if you're expecting them in the next release, you'll just find more
> disappointment. Sorry about that.
>

I am using unordered_maps now, just added the the code posted for review to
1.34.1 branch.
My app is heavily multithreaded (web server), has 2 huge unordered maps keyed on
pair of pointers. Did a load test this morning, appears to work fine. I use
Hsieh's hash function (http://www.azillionmonkeys.com/qed/hash.html). I'll
upgrade to the SVN version if there have been any changes since the review.

As an aside, my usage exposes a case where unordered lose miserably to ordered:

I have a lexicon of words (wstring). Typically it has 20K entries (words in a
given language, so well distributed), and is looked up often. Storing this as an
unordered_map (with either boost:hash or Hsieh's hash) results in significantly
lower lookup performance than in an ordered container.

I am guessing this is because most of my lookups succeed. If set has N strings
and looked up string has size s, lookup is O(log(N)(1 + sk)) character compares
in an ordered set, where k is the proportion of strings with size=s; typically k
<< 1. While lookup in the hash set is O(C(1 + sj)) character compares + O(s) bit
twiddles to calculate the hash, where C is the average chain length & j is the
proprtion of strings in the right chain with size=s; j < 1. Since we intuitively
expect k < j, and O(bit shift) > O(compare), ordered sets would win for lookups
that succeed most of the time (unordered should be better if lookups failed most
of the time, since it would reject without any comparisons).

Just thought I'd share in case anyone is contemplating using unordered_sets for
dictionaries :)


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk