Boost logo

Boost :

From: George A. Heintzelman (georgeh_at_[hidden])
Date: 2002-09-05 16:19:27


Hi boosters,

I've been following this interval discussion with some interest. I
finally decided to see if I could add something to the discussion and
maybe clarify things for people watching who (like me until now)
haven't looked carefully at a few of the issues. I think the people
really active in this discussion probably understand what I'm going to
say, but I haven't seen it laid out.

> Doesn't the behavior we would like out of the standard library/STL force
> us to do that? Does "not less than" need to be deducible from < and ==?

Okay, I've seen a lot of people throw out 'what the STL implies'
without exactly what that is being laid out. So here I go. If I miss
something, I'm sure someone will (gently?) correct me.

There are three STL issues. One is use in std::find and similar algorithms/container member functions. For this purpose, the STL uses operator==. This suggests to me that we ought not to have operator== mean anything other than that both the begin and the end are ==. I think that will be the least surprising behavior; people who know they need to look for somehow overlapping intervals will be thinking about it and simply supply a new comparator for find_if. (The interval library could supply a few if they are found to be useful, of course).

The second STL issue is using std::sort, which uses operator< by default. If we want this to work, operator< must induce a strict weak ordering; that is to say, we have to be able to group elements in equivalence classes, then have a total ordering on the equivalence classes. There are two possible definitions that I see for this so far. One is, use lexicographic (which is itself a total ordering); but this isn't terribly helpful for many uses of intervals. The second is, sort according to the midpoint of the interval. This is somewhat more intuitive and may be more useful, but may pay a performance price (and imposes some requirements on the underlying 'number' type, though they are probably easily met for most use cases).

Frankly, I doubt many people are going to std::sort intervals without thinking carefully about what comparator to use (and again, the interval library supplying useful ones would be good); in many cases they will have other constraints on the intervals being sorted (e.g., non-overlapping) that will make the general-case question moot. (Both lexicographic and midpoint orderings give the same results for sets of non-overlapping intervals.) My own opinion on this definition is not made up, but I think I am leaning towards the midpoint definition; if you need pedal-to-the-metal speed, you use a comparator that does exactly what you want in your particular problem. The other alternative would be not to supply one at all (make operator< private, undefined to prevent user definition) and force the user to specify comparators if they were just not thinking about it.

The third STL issue is putting intervals in std::set/map. For this to behave intuitively, you need a total ordering and nothing less will do (strict weak will work only for multiset/map). Fortunately, the STL defaults to std::less<T>, not operator<. This suggests to me that, whatever we do with operator<, we ought to make std::less<interval<T> > use lexicographic ordering, and presumably specialize std::greater similarly. In the relatively common non-overlapping case, this is just a slightly inefficient (in terms of generated code) way of doing exactly what you'd want anyway.

Finally, I would like to say that I don't like the 'solution' of putting a comparator policy in the interval class itself. It fundamentally is not a property of the class, but of the operation being performed. You may well need one comparison in one place and a different comparison in another, on the same intervals. Rather, I think we should select a single (or no) semantic for the operators being overloaded, and certainly supply functions for all of the relationships laid out by Joel Young, plus probably some of the combinations of those.

I hope this adds some clarity to the debate rather than just throwing fuel on the fire.

George Heintzelman
georgeh_at_[hidden]


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk