From: David Bergman (davidb_at_[hidden])
Date: 2002-09-05 16:35:42
How would you define the mid-point relation for elements having
identical midpoints, but varying extension, such as [1, 5] and [2, 4]?
Since we want a total ordering, and not be forced (through the
"equivalent if incomparable" hypothesis of the STL library) to have them
[mailto:boost-bounces_at_[hidden]] On Behalf Of George A.
Sent: Thursday, September 05, 2002 5:19 PM
Subject: Re: [boost] Interval Library and comparison operators
I've been following this interval discussion with some interest. I
finally decided to see if I could add something to the discussion and
maybe clarify things for people watching who (like me until now)
haven't looked carefully at a few of the issues. I think the people
really active in this discussion probably understand what I'm going to
say, but I haven't seen it laid out.
> Doesn't the behavior we would like out of the standard library/STL
> force us to do that? Does "not less than" need to be deducible from <
> and ==?
Okay, I've seen a lot of people throw out 'what the STL implies'
without exactly what that is being laid out. So here I go. If I miss
something, I'm sure someone will (gently?) correct me.
There are three STL issues. One is use in std::find and similar
algorithms/container member functions. For this purpose, the STL uses
operator==. This suggests to me that we ought not to have operator==
mean anything other than that both the begin and the end are ==. I think
that will be the least surprising behavior; people who know they need to
look for somehow overlapping intervals will be thinking about it and
simply supply a new comparator for find_if. (The interval library could
supply a few if they are found to be useful, of course).
The second STL issue is using std::sort, which uses operator< by
default. If we want this to work, operator< must induce a strict weak
ordering; that is to say, we have to be able to group elements in
equivalence classes, then have a total ordering on the equivalence
classes. There are two possible definitions that I see for this so far.
One is, use lexicographic (which is itself a total ordering); but this
isn't terribly helpful for many uses of intervals. The second is, sort
according to the midpoint of the interval. This is somewhat more
intuitive and may be more useful, but may pay a performance price (and
imposes some requirements on the underlying 'number' type, though they
are probably easily met for most use cases).
Frankly, I doubt many people are going to std::sort intervals without
thinking carefully about what comparator to use (and again, the interval
library supplying useful ones would be good); in many cases they will
have other constraints on the intervals being sorted (e.g.,
non-overlapping) that will make the general-case question moot. (Both
lexicographic and midpoint orderings give the same results for sets of
non-overlapping intervals.) My own opinion on this definition is not
made up, but I think I am leaning towards the midpoint definition; if
you need pedal-to-the-metal speed, you use a comparator that does
exactly what you want in your particular problem. The other alternative
would be not to supply one at all (make operator< private, undefined to
prevent user definition) and force the user to specify comparators if
they were just not thinking about it.
The third STL issue is putting intervals in std::set/map. For this to
behave intuitively, you need a total ordering and nothing less will do
(strict weak will work only for multiset/map). Fortunately, the STL
defaults to std::less<T>, not operator<. This suggests to me that,
whatever we do with operator<, we ought to make std::less<interval<T> >
use lexicographic ordering, and presumably specialize std::greater
similarly. In the relatively common non-overlapping case, this is just a
slightly inefficient (in terms of generated code) way of doing exactly
what you'd want anyway.
Finally, I would like to say that I don't like the 'solution' of putting
a comparator policy in the interval class itself. It fundamentally is
not a property of the class, but of the operation being performed. You
may well need one comparison in one place and a different comparison in
another, on the same intervals. Rather, I think we should select a
single (or no) semantic for the operators being overloaded, and
certainly supply functions for all of the relationships laid out by Joel
Young, plus probably some of the combinations of those.
I hope this adds some clarity to the debate rather than just throwing
fuel on the fire.
Unsubscribe & other changes:
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk