Boost logo

Boost :

Subject: Re: [boost] [unordered] Buffered functions?
From: David Abrahams (dave_at_[hidden])
Date: 2008-11-26 19:25:04


on Wed Nov 26 2008, "Daniel James" <daniel_james-AT-fmail.co.uk> wrote:

> 2008/11/26 David Abrahams <dave_at_[hidden]>:
>>
>> on Wed Nov 26 2008, Ion Gaztañaga <igaztanaga-AT-gmail.com> wrote:
>>
>>> For example, unordered_multisets have an additional pointer to group
>>> equal values in a singly linked list so that bucket traversing is
>>> linear to the number of different values instead of the number of
>>> values in the bucket. It's a nice feature and useful in several
>>> scenarios (I even added that option to Intrusive unordered
>>> containers), but you have to pay for it ;-)
>>
>> Hmm, that doesn't make a lot of sense to me. Lots of equal values in a
>> hash table can quickly destroy its O(1) lookup characteristics.
>
> That's what it avoids. Because equal values are grouped together, you
> only need to examine a single key for all the equal values.

I understand.

> Here's a
> quick diagram of a bucket in an unordered_multiset (assuming that the
> hash function is terrible and 5,3 and 2 all end up in the same
> bucket):
>
> http://www.calamity.org.uk/x/bucket.png

Yes, I figured it was something like that. It still doesn't make sense
to me. If the number of repeated values in a bucket is small, it won't
hurt to traverse all of them. If it is really large enough that
traversing all equal values to get to the next value is expensive, it
will destroy O(1) lookup in any buckets with collisions. Did anyone
profile this in a real application and find it to be a win?

> The black links represent the normal single linked list. The red links
> are a circular list running backwards through equal values. This
> results in a red link from the first node in a group of equals values
> to the last node - this lets you quickly skip over equal values. This
> structure is easy to maintain and has some nice results, such as a
> stable order when inserting.

Stable order when inserting is nice, but ought to be achievable without
the overhead of an extra pointer per node.

-- 
Dave Abrahams
BoostPro Computing
http://www.boostpro.com

Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk