
From: Thorsten Ottosen (thorsten.ottosen_at_[hidden])
Date: 2008-04-21 04:39:48

Daniel James wrote:
> On 19/04/2008, Thorsten Ottosen <thorsten.ottosen_at_[hidden]> wrote:
>> Hi Jeremy,
> Jeremy wrote the original version, I'm responsible for the library nowadays.

Of course, sorry about that.

>> I tried expanding your list of primes so that it contains more numbers
>> in the small range, and in particular such that it contains numbers
>> closely above powers of 2. It happens frequently that powers of 2 come up.
> The theory is that you get fewer collisions when the size is as far
> from a power of 2 as possible. But I haven't looked into the research
> that this is based on, so I'm not sure how trustworthy it is.
>> Here's my new list:
>> static const std::size_t prime_list[] = {
>> 5ul, 11ul, 17ul, 29ul, 37ul, 53ul, 67ul, 79ul,
>> 97ul, 131ul, 193ul, 257ul, 389ul, 521ul, 769ul,
>> 1031ul, 1543ul, 2053ul, 3079ul, 6151ul, 12289ul, 24593ul,
>> 49157ul, 98317ul, 196613ul, 393241ul, 786433ul,
>> 1572869ul, 3145739ul, 6291469ul, 12582917ul, 25165843ul,
>> 50331653ul, 100663319ul, 201326611ul, 402653189ul, 805306457ul,
>> 1610612741ul, 3221225473ul, 4294967291ul };
> It was requested that I use smaller increments during the review. I
> haven't looked at it yet, but I created a ticket and I should get
> round to it before 1.36.
>> For the test I conducted, not once did I get a max load factor above 1
>> for containers of size 4, 5, 7, 9 etc up to about 512.
> It's written so that it rehashes whenever an insert will cause the
> maximum load factor to be exceeded. So you never will. The main
> performance worry is that it will rehash too often, so it might be
> necessary to either let it exceed the maximum load factor a little, or
> ensure that it always grows by at least a certain percentage.
> Although for larger sizes you'll generally find that the memory use of
> the nodes (the element size, one or two pointers and the memory
> management overhead for each node) is much larger than the size of the
> bucket array (one pointer per bucket and the overhead for a single
> allocation).

Ok. I assumed wrongly that the buckets were not a pointer to something.

>> Another alternative would be to allow users to control the initial
>> bucket count explicitly in the constructor. I think I might like this
>> better. Something like
>> boost::unordered_set<T*> the_set( boost::unordered::bucket_count(32u) );
>> assert( the_set.bucket_count() == 32u );
> The problem is that performance will typically degrade when the bucket
> count isn't a prime number. Although if you have a high quality hash
> function this shouldn't be the case.
> If you did this, would you want the bucket count to remain fixed,
> regardless of how large the container gets?

My typical use case is that I will fill the container once
(and I know how many elements will be put in there). Then this
object is used again and again for lookup. In this case, allocating room
for 2-3 times the need seems like overkill; allocating perhaps only
20%-30% extra would be more reasonable.

> An alternative that has some appeal is letting the user specify a
> policy class to control the bucket count. Although I'm not sure how
> that would play with the standard.

Poor, probably.

I guess for cases where I know the integers fall in the range, say,
[0,64], I can just use vector<optional<unsigned>>.

