Boost logo

Boost :

Subject: [boost] [unordered] unordered_set::erase() complexity bug?
From: Stefan Strasser (strasser_at_[hidden])
Date: 2009-11-23 18:23:10


I think there is something very wrong with boost's implementation of
unordered_set/map.

1 unordered_set<int> s;
2 for(int c=0;c<nr;++c) s.insert(c);
3 for(int c=nr-1;c>=0;--c) s.erase(s.find(c));

results:

nr: time
20000: 0m0.244s
40000: 0m0.940s
60000: 0m5.584s
80000: 0m4.232s
100000: 0m34.814s
120000: 0m33.486s

the function that's causing this is erase(), which is specified to be O(1)
average, O(n) worst.
I can't think of a reason why it should be O(n) in this case (perfect hash,
erase doesn't rehash).
but even if it was, the results should be at worst quadratic.

load_factor() is below 1, bucket_count(x) is 1.

if line 3 is changed to:

3 for(int c=0;c<nr;++c) s.erase(s.find(c));

20000: 0m0.008s
40000: 0m0.016s
60000: 0m0.028s
80000: 0m0.032s
100000: 0m0.040s
120000: 0m0.044s

as you would expect.

even if I do the very same thing as above using Boost.Intrusive (using a load
factor of 1), I get expected results:

        struct element : intrusive::unordered_set_base_hook<>{
        ...

        for(int c=0;c<nr;++c) s.insert(*new element(c));
        for(int c=nr-1;c>=0;--c) s.erase(s.find(element(c)));

20000 0m0.008s
40000 0m0.016s
60000 0m0.024s
80000 0m0.024s
100000 0m0.032s
120000 0m0.040s

but surprisingly enough, the TR1 implementation shipped with libstdc++ also
has a problem like that:

20000: 0m0.368s
40000: 0m1.448s
60000: 0m1.668s
80000: 0m7.160s
100000: 0m7.808s
120000: 0m13.349s

am I missing something or is there really the same bug in 2 major
implementations of unordered_set?
it effectively caused a deadlock in some of my code.

I'm using 1.38. no bug report in Trac.


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk