Boost logo

Boost :

Subject: Re: [boost] Is Boost.Range broken?
From: David Abrahams (dave_at_[hidden])
Date: 2008-11-23 14:38:29


on Sun Nov 23 2008, "Dave Handley" <dave-AT-dah.me.uk> wrote:

> David Abrahams wrote:
>
>>
>> on Sat Nov 22 2008, "Dave Handley" <dave-AT-dah.me.uk> wrote:
>>
>>> One of the biggest complaints I got when supported boost for my firm
>>> was about the documentation. Generally it just isn't up to the same
>>> standard as the code.
>>
>> I don't see it. It's my experience that the quality of the code and
>> of the documentation are strongly linked; maybe even tautologically
>> linked. If you can't tell what something is supposed to do from
>> reading the docs, how can its implementation possibly be correct in
>> any meaningful sense?
>
> So effectively you are saying that if I am correct in my assertion
> that boost documentation is frequently poor, then it follows that
> boost itself is poor. I can't say I agree - in general, even for
> those libraries where I've suffered poor documentation, I've still
> been pretty happy with the library (with a few exceptions).

I guess it's a matter of personal judgement, then.

>> I'm not saying all of Boost is well-documented of course; probably
>> some of the code isn't as good as I'd like. Also, not knowing the
>> speicifics of peoples' complaints, it's hard to understand much about
>> what constitutes "well-documented" from their point of view.
>
> If I get the chance, I may feed back some of the complaints I've
> heard. Although I haven't had good experiences with this list in the
> past (this thread being a good example).

What's your problem with this thread? The problem is certainly getting
a lot of attention.

>>>>> at the top of iterator_range we see:
>>>>>
>>>>> /*! \file
>>>>> Defines the \c iterator_class and related functions.
>>>>> \c iterator_range is a simple wrapper of iterator pair idiom. It provides
>>>>> a rich subset of Container interface.
>>>>> */
>>>>>
>>>>> This implies to me that range is trying to look and feel like a container -
>>>>> not like an iterator.
>>>>
>>>> I understand that you drew that conclusion, but IMO it's a huge stretch
>>>> to claim that a concept that doesn't even exist for containers
>>>> (singularity) should behave in some container-like way for ranges.
>>>
>>> It's not a huge stretch, it's just something that is useful for
>>> generic programming.
>>
>> That seems like a non-sequitur. The comment I'm replying to seems to
>> claim that Ranges are container-like, and thus their is_singular
>> function should behave in a container-like way. I still maintain that's
>> a huge stretch: there's just no a priori container-like behavior for a
>> function called is_singular, since there are no singular containers.
>
> At no point have I argued that the is_singular function makes a range
> more container like.

I wasn't claiming you did. I suggest we drop this point as I've been
unsuccessful thus far in communicating it and I don't much care whether
anybody understands it.

> All the other "container-like" functions make range more container
> like. Like begin(), end(), size(), empty(), operator[], etc. None of
> these functions exist on a standard library iterator.

No argument. Again, let's drop it unless you're really eager to
understand what I was saying.

>>>>> Perhaps. But did the documentation guarantee that it would work?
>>>
>>> I believe it did - looking at a quote from Tom elsewhere in the thread.
>>
>> Agreed.
>>
>>>> You're free to define models of Range that have a default-constructed
>>>> empty state. Requiring all models of Range to behave that way is
>>>> antithetical to the principles of generic programming.
>>>
>>> And that isn't what I'm arguing for.
>>
>> Maybe not, but the argument that a supposed model of Range is broken
>> because it is not sufficiently container-like is headed in that
>> direction.
>
> Not really, there are plenty of libraries around there, specifically
> designed for generic programming, where different entities do
> different things. That's why the standard defined all the different
> types of iterators for example - random access iterators are different
> to bi-directional iterators in many ways, but are still called
> iterators. Just because you have some ranges where default
> construction implies empty, doesn't mean all of them have to.

Sure. The argument I was responding to included statements like,

  "This implies to me that range is trying to look and feel like a
  container - not like an iterator."

and

  "...if you think of a range as being more like an iterator than a
   container, but I would argue that it is more like a container."

Those (maybe unintentionally) are general statements about the range
concept rather than a specific one about iterator_range.

>>>>> In general, range sits in a strange middle ground. It is trying to
>>>>> be a clever pair of iterators, and also trying to look and feel like
>>>>> a container.
>>>>
>>>> I disagree. IMO Range is a pair of iterators You can see that by
>>>> looking at all of the models that don't behave in any container-like
>>>> way, e.g. std::pair<int*,int*>, and if you read any book on the STL
>>>> you'll even see pairs of iterators referred to as "ranges." The key
>>>> feature of a Container that distinguishes it from a Range is its
>>>> ownership of the values: when you copy it, the values are copied, too.
>>>
>>> One feature of some containers is that. Some containers (not in the
>>> STL) don't own the data.
>>
>> Then you're not talking about the STL container concept
>> (http://www.sgi.com/tech/stl/Container.html) and I don't know what you
>> *are* talking about when you say "container."
>
> When I say container I mean any container that anyone has written.

In other words, it's a term without a solid definition?

> I usually try (although don't always succeed) to distinguish this from
> the example you are giving above, by talking about standard library
> containers in that context.

I'm not talking about standard library containers, in the sense of
vector, list, deque, etc. I'm talking about the standard library
definition of what it means to be a container.

> I have personally written containers that don't necessarily own data.

What makes them "containers," then? What do you consider container-ness
to be, if it doesn't include element ownership?

> I don't think containers in general have a single "defining feature".
> I agree that owning the data is a common feature of many containers,
> but so is the interface they have. A lot of containers have a
> superset of the interface we see on boost::iterator_range.

If you're going to argue about peoples' expectations for a category of
types based on its similarity to some notion of "container," you at
least have to make reference to some commonly-understood definition of
the term. I chose the one in the standard, because that's what we all
have access to.

>>> An iterator range is also not an iterator,
>>
>> Never claimed it was. It's a range (see
>> http://www.sgi.com/tech/stl/Iterators.html)
>
> I know it's a range. Incidentally, the link you have doesn't define a
> range - it just refers a lot to ranges (and mentions a few properties
> of them).

This is a definition of "range of iterators."

  Most algorithms are expressed not in terms of a single iterator but in
  terms of a range of iterators [1]; the notation [first, last) refers
  to all of the iterators from first up to, but not including, last. [2]
  Note that a range may be empty, i.e. first and last may be the same
  iterator. Note also that if there are n iterators in a range, then the
  notation [first, last) represents n+1 positions. This is crucial:
  algorithms that operate on n things frequently require n+1
  positions. Linear search, for example (find) must be able to return
  some value to indicate that the search was unsuccessful.

> A range, as I've said before, is something in between a
> container and an iterator. It has features of both, it also lacks
> features of both. We shouldn't necessarily design singularity into
> ranges (which is what seems to have happened at the moment), since as
> library writers, we are allowed to define whether a particular range
> has singularity or not.

On that point we agree.

>>> as well as not being a container.
>>>
>>>> It's unfortunate that the original implementation of iterator_range
>>>> set up different expectations for you, but AFAICT the only thing that
>>>> a Range has in common with a container is that it supplies a begin()
>>>> and end() that delimit a sequence of elements.
>>>
>>> And an empty() and a size()
>>
>> Okay, fine, but with a different (free function) syntax.
>
> empty() and size() on an iterator_range are both member functions - so
> I'm not sure what you mean here. Boost::iterator_range also provides
> front and back.

Yes, but now you've switched backfrom talking about the range concept
to talking about a specific model of that concept.

>> Right. Since you want to talk about what should drive the design of
>> iterator_range, I'll tell you what's relevant to this discussion:
>> iterator_range should represent a range built on iterators as its name
>> implies (see the link I provided) and should not make any promises that
>> force it to have space or time overhead beyond what a pair of iterators
>> costs.
>
> I disagree - what should drive the design of iterator_range is what
> people will find useful.

I think there are any number of designs people might find useful,
including ones that deviate substantially from either one under
discussion. I am saying that we have an unfortunate precedent to
contend with, so we may or may not be able to supply the ideal design
under that name, but if we were starting from scratch, a Boost component
called iterator_range that contains two iterators delimiting a range of
elements should meet peoples' expectations by following the design
principles of generic programming, and the extra bool simply clashes
with those principles.

> The link you provided uses a range - it doesn't define it. The
> library is free to define ranges to have the behaviour that is
> considered useful.

I disagree, and for the same reason I argued that the filesystem library
mustn't have a function called "base_name" that meant something
fundamentally different from Posix's base_name. In our domain, "range"
has an accepted meaning.

> In this case different people are asserting that different behaviour
> is useful - in your case you assert that time and space constraints
> are important, Tom has shown that valid empty ranges are useful.

I assert that both are useful.

> The most obvious solution to this is 2 versions of the iterator_range
> class.

Yes, if Tom needs the old behavior, he should have it. The question is,
how should he get it? He could write it himself, or Boost could provide
it in any number of ways.

I think if I were the Range library maintainer, I would feel responsible
enough for my initial design mistake to have made this transition a lot
easier for my users, and I'd be bending over backwards to make up for
the breakage. I would also be very reluctant to maintain the old design
as anything more than a deprecated feature, though, because --- in my
opinion --- it is fundamentally flawed.

That said, it's not up to me as long as we're going to respect
Thorsten's right to maintain his own library.

>> Of course, it may be too late for that if people have been relying on
>> the earlier behavior.
>>
>>> It is; however, what the library intends it to be.
>>
>> Hm?
>
> That assertion comes from the comment in the code that says
> iterator_range is container-like.

Apparently the "intentions of the library" are a bit like the shifting
sands.

>>> Finally, we can discuss detailed semantics of what an iterator_range
>>> should be all we like, the core issue here though is that
>>> iterator_range changed unannounced between 2 versions of boost.
>>
>> Well, that's certainly a separate issue, and a real problem, and one
>> that needs to be addressed somehow.
>
> Agreed.
>
>>
>>> Why can't we come to an agreement that 2 versions of iterator_range
>>> would make some sense, and move this discussion forward?
>>
>> Because I'm not yet convinced that's the way to go. Maybe the best
>> solution is a preprocessor switch that restores the old behavior.
>
> In that context you would be telling Tom that because he wants the old
> behaviour in some of his code, he would be banned from using the new
> behaviour elsewhere in his code? Is that your intention?

Whoa there, Mr. Prosecutor! I'm not intending to ban anything; I am
merely exploring the solution space.

Incidentally, Tom cannot have both behaviors under the same name
(including namespace) without either violating the ODR or introducing
still more runtime overhead, thread safety problems, and other global
state evil. If he is going to use both behaviors, I'd prefer he use the
new behavior under the existing name and use a different name elsewhere.
If that was indeed a viable option for him (though I expect it's not),
the workaround would be easy: provide his own old_iterator_range

-- 
Dave Abrahams
BoostPro Computing
http://www.boostpro.com

Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk