Boost logo

Boost :

From: Guillaume Melquiond (gmelquio_at_[hidden])
Date: 2002-09-03 13:50:26


On Tue, 3 Sep 2002, Douglas Gregor wrote:

> Here is my review of the interval library. Thus far I've read through the
> documentation, skimmed the source code, and tried a few examples with an eye
> toward using the library for static analysis (involving intervals over
> integers and symbolic expressions). The sticking point for me with this
> library is the handling of interval comparisons. I'll start with those:

The handling of interval comparisons was also a point that was bothering
us.

> [Comparisons]
> * The 'bool' return value from comparisons doesn't give enough information,
> because 3 states {false, indeterminate, true} are mapped to 2 states {false,
> true} before the comparison returns. Thus to see if a comparison is
> indetermine one needs (based on compare_certainly):
>
> interval<T> i, j; // get i, j from somewhere
> if (!(i < j) && !(i >= j)) {
> // ...
> }
>
> My opinion is that mapping to bool from any larger state-space should be
> done as late as possible, so that information isn't lost prematurely.

Yes, it's something that can be done. And it will probably be done. It's
only a matter of changing the return type of the comparison operators and
the impact on the interface is quite minimal.

However, the version being reviewed will not be changed. Only the CVS will
see the changes.

> * I'm not a big fan of the comparison policy. Policies are great for code
> reuse, but when they are used to decide between different syntactic forms for
> equivalent semantics things can get _very_ confusing. This code is quite
> ambiguous:
>
> template<typename T, typename Traits>
> void foo(interval<T, Traits> x, interval<T, Traits> y)
> {
> if (x < y) {
> // ...
> }
> }
>
> Checking 'x < y' has no well-defined semantics, because 'foo' has no way to
> know what the comparison policy is. This forces all generic code to use
> 'cerlt' or 'poslt' instead to give reasonable semantics. Essentially, any
> benefit of the '<' syntax is lost because one needs full contextual
> information to determine the semantics of the operation.

Yes, comparison operators are a real problem as I said before (and it also
appears in the documentation). Let's the example of some other interval
library. I just hope I'm not mistaken.

Sun C++ Interval Library: the comparison operators are not defined and the
user has to use explicit comparison functions.

Profil/BIAS: the comparison operators are defined but are the inclusion
operators (subset, proper subset, superset, etc).

MPFI: the library is written in C so there is no operator overload
available. But it is stated that "Warning: the meaning of interval
comparison is not clearly defined" and the user has to provide customized
comparison functions.

CGAL: the comparison operators are defined, but an exception is thrown
each time the intervals overlap.

They are not the only libraries; but they already are used by a lot of
people. And I didn't see any satisfactory interface for the comparison
operators. And I would have been alone to write this library, I wouldn't
have defined them (and please remember, it was before the discussions on
tribool). So I'm really not surprised you think "'x < y' has no
well-defined semantics".

> * I don't understand the reasoning behind the definition of <=. The
> documentation says "...another misleading interpretation of the comparison
> is: (still with the example of compare_certainly) [a, b] < [c, d] is true if
> and only if it is true for each pair of element in [a, b] and [c, d]. But
> [a, b] <=[c, d] is true if and only if it is true for at least one pair."
> Why choose this (to me, non-obvious) semantics? Given a comparison such as
> [3,5] <= [4,6], why would we consider this true? I view an interval
> comparison [3,5] <= [4,6] as the comparison x <= y where x is in [3,5] and y
> is in [4,6], so the result is indeterminate (x could be 5 when y is 4, but x
> could also be 3 when y is 6, so either true or false is possible). What
> interpretation of [3,5] <= [4,6] leads to a value of (certainly) true?

You are right, there is two interpretation of <=. And it was also one of
the conflicting point during the design phase of the library.

First, you can say that [a,b] <= [c,d] is true iff it is true for each
pair of elements.

But you can also say that [a,b] <= [c,d] is the same as !([a,b] >
[c,d]). And unfortunately, this interpretation is not equivalent to the
previous one.

So we had to choose an interpretation. And we finally choosed the second
one. The reason is: an user who is not attentive enough may transform "if
!(a <= b)" in "if (a > b)". If it was the first interpretation, the
program would suddenly be incorrect. It can even be more dangerous if you
are using an algorithm that was not written by yourself: how can you be
sure the programmer didn't do this mistake, and this error never appeared
before because nobody was using intervals with it.

> * The <= semantics from above don't hold for cerle. Consider this fragment:
>
> interval<int> i(3, 5);
> interval<int> j(4, 6);
> assert(i <= j); // okay, based on above semantics
> assert(cerle(i, j)); // fails! the cerle semantics are different from <=?
>
> I would expect the semantics of <= with compare_certainly to coincide with
> the semantics of cerle, but they don't. (I prefer the cerle semantics)

The reason is in my answer to your previous point:

- cerle(x,y) is true iff for each pair we have <=.
- x <= y is !(x > y).

> * Comparisons require the underlying type 'T' to be a total order, but I
> don't see the logical basis for this requirement because a partial order will
> suffice (intervals are a partial order regardless). I realize I'm pushing the
> interval library out of its intended domain, but the library is very close to
> usable more abstract domains (e.g., intervals bounded by symbolic
> expressions).

As I said in another mail, I agree that a partial order can be enough if
it is compatible with the arithmetic operations. But please remember that
if you need to use the multiplication or the division, you need to know if
the bounds are positive or negative.

> [Comparisons - potential solution]
> My proposed solution is:
> * Use a 3-state boolean type for the result of comparisons between
> intervals. The 'tribool' type I posted a few weeks ago is one such type that
> I believe would give a reasonable semantics.

Yes, the library will be modified to allow the return type of the
comparison operators to be chosen by the user.

> * Remove comparison policies entirely. With an accepted 3-state boolean
> type, there will be one semantics for 3-state boolean values that users will
> need to learn, so an interval comparison 'i < j' has a single, well-defined
> semantics in terms of 3-state boolean values and its narrowing to a standard
> 'bool' will be consistent.

Entirely remove the comparison policy? You want the 3-state boolean type
to be fixed by the interval library? :-)

> * Remove cer?? and pos?? functions. 3-state boolean values can be queried
> directly to determine if they are false, indeterminate, or true. Without
> comparison policies, conversions to 'bool' are predictable.

I'm against the removal of the explicit comparison functions. Please
understand that such a function only needs one number comparison to know
what to answer. A comparison operator returning a tristate boolean may
need two numbers comparison before answering (and depending on the
compiler, maybe one more to convert it to bool). So they are really
fastest.

In certain situations, it can even add lisibility to the program. For
example, how will you write poslt(x,y)? I'm not sure your answer will be
more readable.

> * Reformulate comparisons & routines that use comparisons to allow the
> underyling type 'T' to model only a partial order. For instance, the
> 'intersect' routine requires a total order:
>
> template<class T, class Traits> inline
> interval<T, Traits> intersect(const interval<T, Traits>& x,
> const interval<T, Traits>& y)
> {
> using std::min;
> using std::max;
> if (interval_lib::detail::test_input(x, y))
> return interval<T, Traits>::empty();
> const T& l = max(x.lower(), y.lower());
> const T& u = min(x.upper(), y.upper());
> if (l <= u) return interval<T, Traits>(l, u, true);
> else return interval<T, Traits>::empty();
> }
>
> If 'T' is only a partial order, the last two lines need to change to take into
> account the possibility that 'l <= u' cannot be determined. So, for instance,
> if the 3-state boolean type evaluates true only when its value is true (like
> compare_certainly - tribool acts this way), the last two lines would become:
>
> if (l > u) return interval<T, Traits>::empty();
> else return interval<T, Traits>(l, u, true);
>
> The difference is that we return an empty interval only when we are _sure_
> that the interval is empty.

Is it really better to return an empty interval only when you are sure it
is empty? I personnally prefer to return a valid interval only when it is
really valid.

It is the reason why it was written (l <= u): it was already meant to
handle the case of a partial order (since the order on the floating-point
numbers is only a partial order, cf NaNs) and to return an empty interval
when the bounds aren't ordered.

Regards,

Guillaume


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk