
Subject: Re: [boost] Boost.Algorithm design question
From: Dave Abrahams (dave_at_[hidden])
Date: 2011-10-06 21:42:30


on Thu Oct 06 2011, "Stephan T. Lavavej" <stl-AT-exchange.microsoft.com> wrote:

> [Marshall Clow]
>> So, what do people prefer (and why?):
>
>> template<typename InputIterator, typename V>
>> bool none_of_equal ( InputIterator first, InputIterator last, V const &val )
>> or
>> template<typename InputIterator>
>> bool none_of_equal ( InputIterator first, InputIterator last,
>> typename iterator_traits<InputIterator>::value_type const &val )
>
> [STL]
>> #1 is better. It follows the STL's conventions (e.g. look at
>> std::find()) and permits heterogeneous comparisons.
>
> [Dave Abrahams]
>> Sorry, STL. The other STL's conventions are wrong. :-)
>
> I guess it's a question of design philosophy - like how users of
> Loki's smart pointers say "yay policies" and users of
> boost/std::shared_ptr say "boo policies".

Well, no. I'm trying hard to represent the views and theory of the
STL's original author. I could just as easily look at it your way. But
I think there's powerful and underappreciated validity to viewing
algorithms as abstractions, and you can't really do that if the
implementation leaks out in the specification.

> The design philosophy that I'm advocating is "preserve the user's
> types whenever possible". Here, a range of T and a value of V are
> involved. #1 says "all I need is t == v to return something
> convertible to bool, and I'll tell you whether this is 'false' for all
> elements in the range." It doesn't attempt to convert T to V or V to
> T. It doesn't attempt to say v == t, or !=, or anything else. It
> passes no judgment on what t == v "means". Perhaps t == v does attempt
> to convert T to V or V to T, but that's up to the types involved.

Yeah, but do you see how this gets the user involved with the
implementation details of the algorithm? You can get away with it with
these algorithms because they're so simple, but even some simple
algorithms run into problems with this way of doing things. I don't
remember the details right now, but it started to show up when we were
working with concepts, which have the effect of (partially) checking the
validity of your specifications. There were algorithms that allowed all
kinds of heterogeneity and conversions in their implementations, but
wouldn't compile with the concept constraints given in the standard. If
you tried to come up with concept constraints that actually would
compile without breaking backward compatibility, it turned out to be
horrible. I think find_if might have been one of them. Doug Gregor
knows for sure.
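
For reference, the entire contract of #1 fits in a minimal sketch (my
sketch, not Marshall's actual implementation):

  template <typename InputIterator, typename V>
  bool none_of_equal(InputIterator first, InputIterator last,
                     V const& val)
  {
      // The only thing asked of the types is that *first == val be
      // convertible to bool; no conversion between the range's value
      // type and V is forced by the signature.
      for (; first != last; ++first)
          if (*first == val)   // heterogeneous comparison is fine
              return false;
      return true;
  }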

> Here's another example. Consider vector<unsigned char> and val =
> 0x1234UL. For all i, v[i] == 0x1234UL is false (the usual arithmetic
> conversions widen unsigned char to unsigned long). #1 will report that
> none of the unsigned chars are equal to 0x1234UL. But #2 smashes
> unsigned long to unsigned char, which must compile (possibly emitting
> a warning), and then the unsigned chars will be compared to 0x34.

Well, this is the algorithm needing to accommodate a weakness of the core
language (implicit narrowing conversions). And since we don't live in
an ideal world, maybe you have a point.
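
Concretely, using the sketch above (my code, just spelling out
Stephan's example):

  #include <vector>

  int main()
  {
      std::vector<unsigned char> v;
      v.push_back(0x34);
      v.push_back(0x56);

      // #1: 0x1234UL stays an unsigned long; each element is widened
      // for the comparison, nothing matches, and the result is true.
      bool r1 = none_of_equal(v.begin(), v.end(), 0x1234UL);

      // #2 effectively narrows the value to the range's value_type
      // at the call site, i.e. to 0x34, and then finds a spurious
      // "match", so the result is false.
      bool r2 = none_of_equal(v.begin(), v.end(),
                              static_cast<unsigned char>(0x1234UL));

      return (r1 && !r2) ? 0 : 1;   // exits 0: the two designs disagree
  }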

> [STL]
>> Heterogeneous comparisons are surprisingly popular, and the STL is
>> moving to support them better. For example, std::lower_bound() didn't
>> support them in C++03, and does in C++11.
>
> [Dave Abrahams]
>> /me realizes he is responsible for this change
>> But that's *totally* different.
>> /me fumbles around for the explanation
>> For the version that uses "<", it's an unintentional side-effect of
>> making the specification uniform. The function signature already took
>> an unconstrained object by reference, and my paper was not trying to
>> change that. It just happened to make the /semantics/ of using a type
>> different from the value type of the sequence meaningful in that case.
>> It also made the specification *simpler* overall and easier to
>> understand.
>
> Heh. The new specification is great (as a Standard Library maintainer,
> I *really* appreciate the clarity of thought in the C++11
> wording). I'm just saying I think it's greater than you realize.
>
>> The _purpose_ was to support heterogeneity for the version that uses a
>> function. The description in terms of partitioning is a more principled
>> and generic way to describe the actual requirements of the algorithm.
>> There was never any intention to promote the use of heterogeneous "<",
>> which again, doesn't make sense.
>
> It does make sense. For example, consider string and const char * (and
> it's not like I prefer const char *, it's just extremely common and
> users use it all the time).

True.

> I could have a lexicographically sorted sequence of const char *. I
> have a std::string (say its contents are "meow") and I'd like to
> perform a lower_bound() with op<(). The sequence is partitioned with
> respect to e < string("meow"), so C++11 lower_bound() handles this
> just fine. (C++03 technically did not, and VC8/9 would complain in
> debug mode that the sequence was not sorted by op<(), which is why
> I've thought about lower_bound() a lot...)
>
> The trick here is that op<() means something very different for (const
> char *, const char *) than it does for (const char *, string),
> (string, const char *), and (string, string).

Yeah, that's a trick. That nonuniformity is not encouraging.
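
For anyone following along, the scenario looks something like this (my
sketch of Stephan's example, not his actual code):

  #include <algorithm>
  #include <string>

  int main()
  {
      // Lexicographically sorted sequence of const char*.
      const char* words[] = { "cat", "dog", "meow", "purr" };

      // The sequence is partitioned with respect to
      // e < std::string("meow") -- the mixed-type operator< compares
      // lexicographically -- so C++11 lower_bound handles it, even
      // though const char* < const char* would compare pointers.
      const char** pos =
          std::lower_bound(words, words + 4, std::string("meow"));

      return pos - words;   // 2: points at "meow"
  }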

-- 
Dave Abrahams
BoostPro Computing
http://www.boostpro.com
