Boost logo

Boost :

From: Daniel James (daniel_james_at_[hidden])
Date: 2007-12-10 15:54:30


Thanks for the review, there's a lot to act on.

On 10/12/2007, Hervé Brönnimann <hervebronnimann_at_[hidden]> wrote:
>
> Definition of operator==:
> ---------------------------------
>
[snip]
>
> I find it unfortunate that it's not defined in the standard. Howard
> gives the same implementation I would have given, which works in O(N)
> time.

And only works for unordered_set. How would you define equality for
unordered_multimap? Do you compare the mapped values, does the order
of the values matter? If it doesn't then the comparison becomes more
expensive, if it does then it's a bit odd for an 'unordered' container
(although this implementation does preserve insertion order, if it
didn't then things would be worse). Is it okay to define equality in
terms of the equality predicate for keys, but operator== for values?

> I found also the following bogus note in the Working Draft:
>
> > 10 Unordered associative containers are not required to support the
> > expressions a == b or a != b. [ Note: This is because the container
> > requirements define operator equality in terms of equality of
> > ranges. Since the elements of an unordered associative container
> > appear in an arbitrary order, range equality is not a useful
> > operation. — end note]

That's not bogus at all. An equality operator for unordered containers
would have completely different semantics to all other containers
(especially when using custom comparisons, comparing associative
containers can have surprising results).

> Daniel, seems to me that you'd have the liberty to implement it, and
> if you chose to not name it operator==, that'd be fine too. It would
> be a boost extension. But it really irks me that an unordered
> container is considered not to have a value (can't have a value if
> you can't compare them) when there's no good reason not to provide
> operator==.

It isn't true that you can't compare them. In fact, you can implement
your own comparison that would be pretty much as efficient as any that
I provided.

Right now, I'm trying to limit myself to obvious design decisions for
extensions so I don't really want to go into this area.

> The prime number list is too short, it roughly doubles every time.
> On the other hand, I understand that it wouldn't do that have a
> *much* larger list, as this static array is included in every
> translation unit.

I think it can actually cause an ODR violation so I probably should
change that anyway. David Abrahams recently posted a way to avoid this
by including the value in a template, so I'll try that.

> One possibility for a much finer one would be to
> make unordered an object library, included in the build. I
> understand if you want to keep it "include-only", but could you at
> least enlarge the list to have an increase of 30% instead of 100%?

How much of a difference would it make? The memory requirements for
the array of buckets is typically smaller than the memory required for
the nodes (unless you have a small number of nodes or increase the
maximum load factor). The primes that I'm using apparently give better
results because they are as far away as possible from powers of two
(I'm pretty sceptical about that though).

But I don't really have a strong opinion one way or the other, so if
no one disagrees, I'll add more primes.

> In the next_prime and prev_prime functions, please do not use the
> hard constant 28. Use instead
> static const int num_primes = sizeof prime_list / sizeof
> *prime_list;

Okay.

> What exactly is the use of pair_cast? There must be a legitimate
> reason that I don't quite understand. The pair template conversion
> constructor should suffice, no?

It works around some of the less standard compliant STL implementations.

> Re: swap: I really think you should be following the committee's
> current advice (I'm looking at the code of hash+table_impl.hpp, lines
> 1194 -- 1219), sooner rather than later. But my opinion is biased.

Why is your opinion biased?

> At least you state precisely what you're doing. This is not a
> criticism of your code. But such change might break your client's
> code, if the standard later codifies its current decision. It seems
> that changing the semantics of swap before unordered is accepted and
> released with Boost, would save some grief later.

That works both ways, as the committee could change its mind. Since we
don't have concepts, following their current decision means slow
swaps. Which isn't that bad since it's only for the rare occasions
when the allocators aren't equal. I haven't got a strong preference
here either, but this was discussed before and the general consensus
seemed to be for what I've done.

> There is some convention in boost that members begin with m_... or
> s_... but I don't think it's very important. Your system (the
> trailing underscore) is fine. Yet, these things show in a debugger.

If there's a convention, it's not followed widely. There are a lot of
boost libraries that use a trailing underscore for members.

>
> Minor comments on the documentation:
> ------------------------------------------------------

I'll act on all of these, a few weak excuses below.

> In hash_equality.html:
>
> * This is not quite true: "An example implementation of FNV-1, and
> some other hash functions are supplied in the examples directory."
> Only FNV is provided... After all, these unordered containers are
> only useful if you have a good hash function to go with it. You
> should definitely mention the Boost.Hash library, which provides
> hashes for these unordered containers. Nevertheless boost.hash does
> not give you tradeoff between speed and good hashing, which is why
> other more specialized hashes could be provided. Besides FNV-1, I
> found Robert Jenkins' (http://www.burtleburtle.net/bob/hash/
> evahash.html) quite good.

I wrote that because I had actually implemented some other hash
functions but then I didn't include them because I was worried about
copyright. But Bob Jenkins's hash functions are public domain, so I'm
not sure why I was worried about them. I'll add something based on his
functions. And I'll check on some of the other ones I was considering.

> * I am always reminded of Matt Austern's column (http://lafstern.org/
> matt/col2_new.pdf) for the example you give. I went back to get
> most o the points, and then read your implementation. Unfortunately,
> your functor implementation falls prey to the default pointed out in
> Matt's article: std::toupper depends on the global locale, so that if
> I call setlocale between two calls to your functor, I might get
> different results. Be sure to check out Matt's column.

Good point. The example is actually a simplified version of
'libs/unordered/examples/case_insensitive.hpp' which doesn't have this
problem (well, it wouldn't if it didn't have a bug which is fixed in
subverison). I probably simplified it too far.

> * "Swap: I followed Howard Hinnant's advice and implemented option
> 3.": Huh? Were are these options 1, 2, and 3 described? You should
> give a link to http://www.open-std.org/JTC1/SC22/WG21/docs/papers/
> 2004/n1599.html.

There is a link but it's cunningly hidden in the heading. I never
noticed that the boost documentation stylesheet obscures it. I'll add
visible links.

Daniel


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk