Boost logo

Boost :

Subject: Re: [boost] Is Boost.Range broken?
From: Dave Handley (dave_at_[hidden])
Date: 2008-11-23 15:41:09


Response inlined - thread snipped where we agree :)

David Abrahams wrote:

>
> What's your problem with this thread? The problem is certainly getting
> a lot of attention.

My main issue with this thread is not the attention it received - it's the
extent to which a discussion about a breaking change to a library has turned
into a philosophical discussion. Really the discussion about what a range
is should have taken place at review time for the library, now we should
just be able to confirm or not that a breaking change has happened, and
either fix it or not.

>
>>>>>> Perhaps. But did the documentation guarantee that it would work?
>>>>
>>>> I believe it did - looking at a quote from Tom elsewhere in the thread.
>>>
>>> Agreed.
>>>
>>>>> You're free to define models of Range that have a default-constructed
>>>>> empty state. Requiring all models of Range to behave that way is
>>>>> antithetical to the principles of generic programming.
>>>>
>>>> And that isn't what I'm arguing for.
>>>
>>> Maybe not, but the argument that a supposed model of Range is broken
>>> because it is not sufficiently container-like is headed in that
>>> direction.
>>
>> Not really, there are plenty of libraries around there, specifically
>> designed for generic programming, where different entities do
>> different things. That's why the standard defined all the different
>> types of iterators for example - random access iterators are different
>> to bi-directional iterators in many ways, but are still called
>> iterators. Just because you have some ranges where default
>> construction implies empty, doesn't mean all of them have to.
>
> Sure. The argument I was responding to included statements like,
>
> "This implies to me that range is trying to look and feel like a
> container - not like an iterator."
>
> and
>
> "...if you think of a range as being more like an iterator than a
> container, but I would argue that it is more like a container."
>
> Those (maybe unintentionally) are general statements about the range
> concept rather than a specific one about iterator_range.
>

This is all based on the following - the first line in the *current* docs
for boost.range.

"A Range is a concept similar to the STL Container concept."

The same pre-amble goes on to talk about the differences between a range and
a container; for example it says:

"In particular, a Range does not necessarily

    * own the elements that can be accessed through it,
    * have copy semantics, "

It doesn't say that a range doesn't have a concept of emptiness, or that it
has properties of an iterator. This is why I have been contending that
ranges are more container like than they are iterator like. I'm not making
this up - it's what the docs tell me.

>>>>>> In general, range sits in a strange middle ground. It is trying to
>>>>>> be a clever pair of iterators, and also trying to look and feel like
>>>>>> a container.
>>>>>
>>>>> I disagree. IMO Range is a pair of iterators You can see that by
>>>>> looking at all of the models that don't behave in any container-like
>>>>> way, e.g. std::pair<int*,int*>, and if you read any book on the STL
>>>>> you'll even see pairs of iterators referred to as "ranges." The key
>>>>> feature of a Container that distinguishes it from a Range is its
>>>>> ownership of the values: when you copy it, the values are copied, too.
>>>>
>>>> One feature of some containers is that. Some containers (not in the
>>>> STL) don't own the data.
>>>
>>> Then you're not talking about the STL container concept
>>> (http://www.sgi.com/tech/stl/Container.html) and I don't know what you
>>> *are* talking about when you say "container."
>>
>> When I say container I mean any container that anyone has written.
>
> In other words, it's a term without a solid definition?

Yes - just like most things in computer science. One of the great things
about being a library writer is that people end up using your stuff for
things you hadn't thought about.

>
>> I usually try (although don't always succeed) to distinguish this from
>> the example you are giving above, by talking about standard library
>> containers in that context.
>
> I'm not talking about standard library containers, in the sense of
> vector, list, deque, etc. I'm talking about the standard library
> definition of what it means to be a container.

In the context of the standard library. The standard library doesn't have a
monopoly on the term container.

>
>> I have personally written containers that don't necessarily own data.
>
> What makes them "containers," then? What do you consider container-ness
> to be, if it doesn't include element ownership?

I consider container-ness to be a wide range of possible features. It could
be defining ownership, or ordering, or anchoring a particular object into a
context (for example a container of smart pointers to objects can be
considered to be containing the smart pointers, but can also be considered
to be preventing the underlying objects from being deleted elsewhere in a
more abstract way).

>
>> I don't think containers in general have a single "defining feature".
>> I agree that owning the data is a common feature of many containers,
>> but so is the interface they have. A lot of containers have a
>> superset of the interface we see on boost::iterator_range.
>
> If you're going to argue about peoples' expectations for a category of
> types based on its similarity to some notion of "container," you at
> least have to make reference to some commonly-understood definition of
> the term. I chose the one in the standard, because that's what we all
> have access to.

Ok then - based on the contents of the docs here:
http://www.boost.org/doc/libs/1_37_0/libs/range/doc/range.html what does
that page mean? I'm basing the similarity on what the docs tell me to
expect - are the docs wrong?

>
>>>> An iterator range is also not an iterator,
>>>
>>> Never claimed it was. It's a range (see
>>> http://www.sgi.com/tech/stl/Iterators.html)
>>
>> I know it's a range. Incidentally, the link you have doesn't define a
>> range - it just refers a lot to ranges (and mentions a few properties
>> of them).
>
> This is a definition of "range of iterators."
>
> Most algorithms are expressed not in terms of a single iterator but in
> terms of a range of iterators [1]; the notation [first, last) refers
> to all of the iterators from first up to, but not including, last. [2]
> Note that a range may be empty, i.e. first and last may be the same
> iterator. Note also that if there are n iterators in a range, then the
> notation [first, last) represents n+1 positions. This is crucial:
> algorithms that operate on n things frequently require n+1
> positions. Linear search, for example (find) must be able to return
> some value to indicate that the search was unsuccessful.

It defines a notation for half open ranges, specifically in the context of
iterators. That's not the same as a definition of a range. It is most
certainly not a generic definition of a range of the like that boost.range
is trying to be.

>>>> as well as not being a container.
>>>>
>>>>> It's unfortunate that the original implementation of iterator_range
>>>>> set up different expectations for you, but AFAICT the only thing that
>>>>> a Range has in common with a container is that it supplies a begin()
>>>>> and end() that delimit a sequence of elements.
>>>>
>>>> And an empty() and a size()
>>>
>>> Okay, fine, but with a different (free function) syntax.
>>
>> empty() and size() on an iterator_range are both member functions - so
>> I'm not sure what you mean here. Boost::iterator_range also provides
>> front and back.
>
> Yes, but now you've switched backfrom talking about the range concept
> to talking about a specific model of that concept.

Are you surprised? That's the original topic of this thread, and the key
point. Again - this is precisely why I stopped posting to boost development
a long time back - a simple problem becomes a lengthy philosophical
discussion. I'm not talking about anything complex here. I've read the
docs, I've looked at the code, I've used the code. Somewhere along the way
we seem to now need exact definitions of iterator, container, range, and a
whole host of other things in order to figure out how to resolve an
undocumented breaking change.

>
>>> Right. Since you want to talk about what should drive the design of
>>> iterator_range, I'll tell you what's relevant to this discussion:
>>> iterator_range should represent a range built on iterators as its name
>>> implies (see the link I provided) and should not make any promises that
>>> force it to have space or time overhead beyond what a pair of iterators
>>> costs.
>>
>> I disagree - what should drive the design of iterator_range is what
>> people will find useful.
>
> I think there are any number of designs people might find useful,
> including ones that deviate substantially from either one under
> discussion. I am saying that we have an unfortunate precedent to
> contend with, so we may or may not be able to supply the ideal design
> under that name, but if we were starting from scratch, a Boost component
> called iterator_range that contains two iterators delimiting a range of
> elements should meet peoples' expectations by following the design
> principles of generic programming, and the extra bool simply clashes
> with those principles.

Disagreed.

>
>> The link you provided uses a range - it doesn't define it. The
>> library is free to define ranges to have the behaviour that is
>> considered useful.
>
> I disagree, and for the same reason I argued that the filesystem library
> mustn't have a function called "base_name" that meant something
> fundamentally different from Posix's base_name. In our domain, "range"
> has an accepted meaning.

We will have to continue to disagree here.

>
>> In this case different people are asserting that different behaviour
>> is useful - in your case you assert that time and space constraints
>> are important, Tom has shown that valid empty ranges are useful.
>
> I assert that both are useful.

Agreed.

>
>> The most obvious solution to this is 2 versions of the iterator_range
>> class.
>
> Yes, if Tom needs the old behavior, he should have it. The question is,
> how should he get it? He could write it himself, or Boost could provide
> it in any number of ways.
>
> I think if I were the Range library maintainer, I would feel responsible
> enough for my initial design mistake to have made this transition a lot
> easier for my users, and I'd be bending over backwards to make up for
> the breakage. I would also be very reluctant to maintain the old design
> as anything more than a deprecated feature, though, because --- in my
> opinion --- it is fundamentally flawed.

So by that statement, you are saying that Tom's use case, which relied on
the old behaviour, is fundamentally flawed? I'm very surprised by that. I
think we are going to have to continue to disagree here.

>
> That said, it's not up to me as long as we're going to respect
> Thorsten's right to maintain his own library.
>
>>> Of course, it may be too late for that if people have been relying on
>>> the earlier behavior.
>>>
>>>> It is; however, what the library intends it to be.
>>>
>>> Hm?
>>
>> That assertion comes from the comment in the code that says
>> iterator_range is container-like.
>
> Apparently the "intentions of the library" are a bit like the shifting
> sands.

Why then do the docs still assert that a range is container-like? That's
the 1.37 docs, not the 1.34 docs.

>
>>>> Finally, we can discuss detailed semantics of what an iterator_range
>>>> should be all we like, the core issue here though is that
>>>> iterator_range changed unannounced between 2 versions of boost.
>>>
>>> Well, that's certainly a separate issue, and a real problem, and one
>>> that needs to be addressed somehow.
>>
>> Agreed.
>>
>>>
>>>> Why can't we come to an agreement that 2 versions of iterator_range
>>>> would make some sense, and move this discussion forward?
>>>
>>> Because I'm not yet convinced that's the way to go. Maybe the best
>>> solution is a preprocessor switch that restores the old behavior.
>>
>> In that context you would be telling Tom that because he wants the old
>> behaviour in some of his code, he would be banned from using the new
>> behaviour elsewhere in his code? Is that your intention?
>
> Whoa there, Mr. Prosecutor! I'm not intending to ban anything; I am
> merely exploring the solution space.

If you are going to use a pre-processor switch to change the behaviour then
that does ban someone from using both versions. Having 2 separate versions
available wouldn't - I'm only expressing a serious dislike for turning off
behaviour with pre-processor switches. I'll use another example from
another boost library that I dislike. In boost.smart_ptr (a library I
really like and use a lot), the thread safety is turned on and off with a
pre-processor switch. This means that in my entire firm, since we might
need thread safe boost shared_ptr everyone has to have the overhead of using
it. We can't mix thread safe and non-thread safe shared_ptr. If we use
pre-processor switches for new and old behaviour of boost iterator_range we
create just the same problem. And since, as we have identified elsewhere in
this thread, the new and old behaviour are to a certain extent mutually
exclusive, then we would be disallowing usage of both versions.

>
> Incidentally, Tom cannot have both behaviors under the same name
> (including namespace) without either violating the ODR or introducing
> still more runtime overhead, thread safety problems, and other global
> state evil. If he is going to use both behaviors, I'd prefer he use the
> new behavior under the existing name and use a different name elsewhere.
> If that was indeed a viable option for him (though I expect it's not),
> the workaround would be easy: provide his own old_iterator_range

I realise that - which was why I believe 2 versions under different names is
by far and away the best way to resolve this. From my perspective I don't
much care what they are each called, but I do strongly believe (in the
absence of a better solution) that 2 versions is the way forward.

Dave


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk