Boost logo

Boost :

From: Yoni Lavi (l_yoni_at_[hidden])
Date: 2007-06-10 13:24:14


I hope I'm posting it to the correct newsgroup. I wanted to mail this
directly to the developer of boost::intrusive (Ion Gazta?aga), but I can't
find his e-mail anywhere, or any way to directly report the bug. I am not a
boost developer myself.

Position of a contained element modulo m (new_buckets_len) is known to
remain equal to its previous position modulo n (old_buckets_len) ONLY IF m <
n AND n %m == 0, but the latter condition isn't being checked. The code
assumes it can skip rehashing of m first buckets if m < n:

         //If we are shrinking the bucket array, just rehash the last nodes
         if(same_buffer && (old_buckets_len > new_buckets_len)){
            n = new_buckets_len;
         }
             //Iterate through nodes
         for(; n < old_buckets_len; ++n){
             ...
        }

Also, the following comment:
               //If this is a buffer expansion don't move if it's not
necessary
               if(same_buffer && new_n == n){
                  ++before_i;
               ...
Needs to be changed simply to "don't move if it's not necessary" - it needs
to be checked/may happen when shrinking the hash table also.

Here's a suggested bugfix:
- check if (old_bucket_len > new_buckets_len && 0 == old_bucket_len %
new_buckets_len).
- if true, i'th bucket needs to store the contents of all previous
buckets at positions i + j*new_buckets_len for j in [0, old_buckets_len /
new_buckets_len) spliced one after another. There is no need to check on a
per-element basis. I believe this optimization should be done even when the
rehash isn't in-place. A special case where this optimization is useful is
when using power-of-2 num_buckets.
- if false, buckets have to be re-scanned, using the existing code in
the body of the existing loop, but starting from n = 0 always:
         for(; n < old_buckets_len; ++n){ ... }

Please correct me if I'm wrong on this.. but just to be sure, I made a test
program that verified rehash() can break the container's invariants when
shrinking the container in place, which was indeed found to be the case.

As a totally unrelated note, I don't understand why iunordered_set doesn't
support a default constructor (so that buckets = 0 and num_buckets = 0,
given the existence of a rehash() function. I see it like the reset()
function in a scoped_ptr. If a "reset" style function is supported, the
class shouldn't force you to initialize it at construction time. Then again,
it's a matter of taste in design more than anything...


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk