Subject: Re: [boost] [unordered] unordered_set::erase() complexity bug?
From: Thorsten Ottosen (thorsten.ottosen_at_[hidden])
Date: 2009-11-23 18:40:57


Daniel James wrote:
> 2009/11/23 Christopher Jefferson <chris_at_[hidden]>:
>> On 23 Nov 2009, at 23:23, Stefan Strasser wrote:
>>
>>> I can't think of a reason why it should be O(n) in this case (perfect hash,
>>> erase doesn't rehash).
>>> but even if it was, the results should be at worst quadratic.
>> I suspect the problem you are seeing is that erase returns an iterator to the next member of the unordered_set, which is a possibly O(n) search if the set is sparsely populated, and in particular if, as you get here, you erase the elements in a particularly bad order. I'm not sure why this doesn't occur with the intrusive containers.
>
> That's right, I profiled it and it spends all the time searching for
> the next bucket.
>
>> One way to significantly reduce this problem is to use a different hash function, which shuffles the values more.
>
> I could apply another hash function to the hash value to make this
> less likely, or perhaps store a pointer to the last non-empty bucket.
> I'll think about it.
>
> A workaround is to erase by key or iterator range if possible.
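
To make that workaround concrete, here is a minimal sketch (the identity
hash and the element count are illustrative, not Stefan's actual test
case): with a hash that puts value i into bucket i, erasing through
iterators in decreasing bucket order forces each erase() to scan forward
over buckets already emptied by earlier calls, whereas erasing by key
never has to compute a successor iterator.

  #include <boost/unordered_set.hpp>
  #include <cstddef>

  // Illustrative identity hash: value i lands in bucket i (mod bucket count).
  struct identity_hash
  {
      std::size_t operator()(int x) const { return static_cast<std::size_t>(x); }
  };

  int main()
  {
      boost::unordered_set<int, identity_hash> s;
      int const n = 100000;
      for (int i = 0; i < n; ++i)
          s.insert(i);

      // Pathological: erase(iterator) must return an iterator to the next
      // element, so each call scans forward over buckets that earlier calls
      // already emptied -- roughly O(n) buckets per erase in this order.
      //
      //   for (int i = n - 1; i >= 0; --i)
      //       s.erase(s.find(i));

      // Workaround from above: erase by key; no successor iterator is needed.
      for (int i = n - 1; i >= 0; --i)
          s.erase(i);
  }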

Our very own Joaquín anticipated this problem very early:

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n2023.pdf

"In pathological cases, h(n) can be as bad as ½ n(n + 1): this happens
when the elements
are erased in reverse order as they are arranged in the hash table."
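
To spell out that bound: with one element per bucket, erasing through
iterators starting from the last occupied bucket means the i-th erase has
to scan past the i - 1 buckets already emptied after it before it finds
the next element (or the end), so the total number of bucket probes is
roughly 1 + 2 + ... + n = ½ n(n + 1).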

It would be interesting to know

1) how intrusive containers avoid this, and

2) whether this is a real problem in real code.

At any rate, Joaquín shows that the *lower bound* complexity of erase is
actually lg n, not O(1).

Does anyone know how the WG21 LWG responded to his paper?

-Thorsten

