Boost logo

Boost :

From: Hervé Brönnimann (hervebronnimann_at_[hidden])
Date: 2007-12-09 21:42:59


First of all, this is a fine implementation. I'm just curious why,
since most compilers now provide a tr1 implementation, if it is just
historical or if this implementation claims to be better than some
other existing TR1 implementations. But with the purpose of fitting
Boost.Tr1, this will fulfill the purpose. And great care was taken
to be standard compliant. I vote for accept.

I tried to spend a few hours looking at the code and reading the
documentation. This is what I find below. I haven't had the time to
even run an example or test the code, though. I limited myself only
to examination.

Overall, the focus and hard-work (esp. with exception safety and its
testing) is impressive. It seems you've taken care of a lot of
standardese aspects. Kudos.

Definition of operator==:
---------------------------------

One small issue I see, is that you should define operator==. I
searched a bit, and found Howard's statement (on http://
www.velocityreviews.com/forums/t291501-stdset-versus-
tr1unorderedset.html):

> I unsuccessfully tried to get an operator== into tr1::unordered_*.
> Assuming a good hash function such an operation can have the
> intended semantics and O(N) complexity."

I find it unfortunate that it's not defined in the standard. Howard
gives the same implementation I would have given, which works in O(N)
time. I found also the following bogus note in the Working Draft:

> 10 Unordered associative containers are not required to support the
> expressions a == b or a != b. [ Note: This is because the container
> requirements define operator equality in terms of equality of
> ranges. Since the elements of an unordered associative container
> appear in an arbitrary order, range equality is not a useful
> operation. — end note]

As if they couldn't specify in the Table "Additional requirements" a
different post-condition, or alternately, state the post-condition in
terms of range only in the Tables for sequence and ordered
associative containers, and put in a different post-condition for
unordered containers. Howard also states:

> The final decision was to let implementors provide it as an
> extension, but not to require it.

Daniel, seems to me that you'd have the liberty to implement it, and
if you chose to not name it operator==, that'd be fine too. It would
be a boost extension. But it really irks me that an unordered
container is considered not to have a value (can't have a value if
you can't compare them) when there's no good reason not to provide
operator==.

Comments on the code:
--------------------------------

I really appreciate the care you've taken wrt exception safety.

The prime number list is too short, it roughly doubles every time.
On the other hand, I understand that it wouldn't do that have a
*much* larger list, as this static array is included in every
translation unit. One possibility for a much finer one would be to
make unordered an object library, included in the build. I
understand if you want to keep it "include-only", but could you at
least enlarge the list to have an increase of 30% instead of 100%?

In the next_prime and prev_prime functions, please do not use the
hard constant 28. Use instead
    static const int num_primes = sizeof prime_list / sizeof
*prime_list;

What exactly is the use of pair_cast? There must be a legitimate
reason that I don't quite understand. The pair template conversion
constructor should suffice, no?

Re: swap: I really think you should be following the committee's
current advice (I'm looking at the code of hash+table_impl.hpp, lines
1194 -- 1219), sooner rather than later. But my opinion is biased.
At least you state precisely what you're doing. This is not a
criticism of your code. But such change might break your client's
code, if the standard later codifies its current decision. It seems
that changing the semantics of swap before unordered is accepted and
released with Boost, would save some grief later.

There is some convention in boost that members begin with m_... or
s_... but I don't think it's very important. Your system (the
trailing underscore) is fine. Yet, these things show in a debugger.

Minor comments on the documentation:
------------------------------------------------------

In buckets.html:

* The sentence "The + 1 is required because the container is allowed
to resize when the load factor is equal to the maximum load factor"
is unclear to me, although I think I figured out the intended
meaning. Resize is most certainly wrong here (size is not changed),
you meant rehash? I'd rephrase "The + 1 guarantees there is no
invalidation; without it, reallocation could occur if the number of
bucket exactly divides the target size, since the container is
allowed to resize when the load factor is equal to the maximum load
factor."

In hash_equality.html:

* This is not quite true: "An example implementation of FNV-1, and
some other hash functions are supplied in the examples directory."
Only FNV is provided... After all, these unordered containers are
only useful if you have a good hash function to go with it. You
should definitely mention the Boost.Hash library, which provides
hashes for these unordered containers. Nevertheless boost.hash does
not give you tradeoff between speed and good hashing, which is why
other more specialized hashes could be provided. Besides FNV-1, I
found Robert Jenkins' (http://www.burtleburtle.net/bob/hash/
evahash.html) quite good.

* I am always reminded of Matt Austern's column (http://lafstern.org/
matt/col2_new.pdf) for the example you give. I went back to get
most o the points, and then read your implementation. Unfortunately,
your functor implementation falls prey to the default pointed out in
Matt's article: std::toupper depends on the global locale, so that if
I call setlocale between two calls to your functor, I might get
different results. Be sure to check out Matt's column.

* the declaration
     boost::unordered_multiset<point, std::equal_to<point>,
point_hash> points;
looks wrong (order of template args).

* I wished you put links to boost.hash.

In comparison.html:

* The spelling parametrise is quite unusual (although correct, very
British!) but in any case incompatible with the C++ Working Draft,
which uses parameterize. I think you should adopt the most common
spelling. Same thing with amortized.

* "ordering comparison: No equivalent. No idea why." A bit of humor
doesn't hurt :)

* "Can be compared using the ==, !=, <, <=, >, >= operators: No
comparison operators are defined." That's a sticky point I mentioned
above.

* Complexity guarantees: You should state (recall?) what n, N, etc.
stand for. Esp since you use size() also.

* "Inserting a range of N elements: Average case O(N), worst case O(N
* 'size()')" single quotes and font around size()

* "Find: logarithmic Average case: O(N), Worst case: O(size())" I
think you meant "Average case: O(1)"

In rationale.hpp:

* "For example, an it..." : delete "an"

* "loosing the efficiency": losing efficiency. Spelling in other
places too

* "Swap: I followed Howard Hinnant's advice and implemented option
3.": Huh? Were are these options 1, 2, and 3 described? You should
give a link to http://www.open-std.org/JTC1/SC22/WG21/docs/papers/
2004/n1599.html.

* reference pages: the constructor that takes arguments (size_type =
impl-defined, ..) should be declared explicit, for all four
containers. The code is already doing the right thing.

--
Hervé Brönnimann
hervebronnimann_at_[hidden]

Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk