Boost logo

Boost :

From: Gennadiy E. Rozental (rogeeff_at_[hidden])
Date: 2001-11-02 14:34:20


Hi

I am woring on improving floating point equality testing. I have
couple issues I would like to discuss.

1. Comparison algorithm
   There are several choises while implementing comparison algorithm
for 2 floats f1 and f2 ans tolerance (epsilon) e.

   a. Currently used algorithm
      |f1-f2| < e
    Most of us know that algorithm causing an errors in many cases.
   
   b. Knuth's algorithms
      |f1-f2| <= e*|f1| && |f1-f2| <= e*|f2| => f1(=)f2 (1)
    Here sign (=) means "almost equal".

      |f1-f2| <= e*|f1| || |f1-f2| <= e*|f2| => f1~f2 (2)
    Here sign ~ means "close enough".

   There is a question what is the best way to implement these
comparisons. (1) could be rewritten following way:
      |1-f2/f1| <= e && |1-f1/f2| <= e => f1(=)f2 (1`)
   While implementing (1) we will need to check for underflows in
right side of comparisons. While implementing (1`) we will need to
check for underflows and overflows in left side of comparisons. But,
even (1) seems to be easier to implement, it seems that it could give
the wrong answer in some case. For example, for f1=1e-10, f2=1e-10
ans e small enough we will have underflow in left side. Any
comparizon with underflow will fail. So we will get f1 != f2. Oops...
(1`) seems to be more safe.
   If f1(=)f2 => f1~f2. So (1) is more strong comparison. Which one
is more usefull is not obvious. We could probably support both.
   Also f1(=)f2 => f2(=)f1 and f1~f2 => f2~f1. So both comparisons
are transitive.
   
   c. Alberto Squassabia algorithm presented in C++ Report 2/2000.

      1-e < f1/f2 && f1/f2 < 1 - e => f1(=3)f2
          (3)
    Here sign (=3) means "f1 close to f2 according to comparison 3".
I intentionally omit underflow and overflow checks to simplify
undestanding.

   (3) => (2). But (3) !=> (1) and (2) !=> (3). So (3) stays
somewhere in the middle between (1) and (2). Also (3) in not
transitive so: f1(=3)f2 !=> f2(=3)f1.
   (3) is cheaper than either (1) or (2) but it unclear why should we
stick to nontransitive logic.

   d. Pete Becker algorithm presented in CUJ 12/2000

     |(f1-f2)/(f2 == 0 ? 1.0 : f2 )| < e (4)
     This is basicly the same as (3) without underflow/overflow
checks, what can cause some unexpected errors. For example for very
big f1 and f2 that are almost equal to each other. (4) has the same
property as (3).
   
2. Choosing of tolerance (epsilon) value.
   Again several choises:

   a. User-provided value.
    This is most safe way. If comparison fails, user know who should
be blamed for this (:-)). But it's not the most convinient. In some
cases user do not know how to choose the tolerance properly and just
want to compare whether or not his values are equal from computer
stand point.

   b. std::numeris_limits::epsilon()
     (How portable is numeric_limits?). We can compare 2 values
against std::numeris_limits::epsilon(). From definition of epsilon()
it's seems to be good choise. There is small catch: if value compared
are results of several arithmetic operation we expected rounding
errors whould be greater. For example, from Knuth:
   
     Rounding error expected from f1+f2 (or any arithmetic operation)
is 1/2*ULP. (ULP is the same as epsilon())
     Rounding error expected from f1*f2*f3 is 2*ULP/((1-ULP/2)^2) ~
2*ULP. It will be easy to prove that number of rounding errors in
result of n arithmetic operatin with floating point numbers is
O(n^2/2* ULP). So we have 2 more choises.

   c. NROUND * std::numeris_limits::epsilon()
      Here NROUND is expected number of rounding errors provided by
the user. It's formula presented in Pete Becker article.

   d. N^2/2 * std::numeris_limits::epsilon()
      Here N is number of floating point arithmetic operaton these
values are participated in provided by the user. In some cases we
won't be able to use this formula because values are participated in
non arithmetic operations. For example, sin.

3. Types of the argument
   Currently Boost Test Tools provide version that allow comparison
of double values only. It's obviously not enough. We want t obe able
to compare values of any floating point types, including user-
defined. So we will need to use template argument T. There is still
question about comparing values of different types, like float and
long double. Should we rely on compiler type conversion? What if I
want to specify what type should be used for comparison? Be aware
that real comparison function will be hidden by helper macro so you
won't be able to explicitly specify the type. Though user can
explicitly cast arguments to an appropriate type. Will it work?

Now, here is scheme for proposed version. What do you think:

// safe f1/f2
template<typename FPT>
FPT
safe_fp_division( FPT f1, FPT f2 ) {
        return (f2 < 1 && f1 > f2 * std::numeric_limits<FPT>::max
()) ? std::numeric_limits<FPT>::max() :
           ((f2 > 1 && f1 < f2 * std::numeric_limits<FPT>::min()) ?
std::numeric_limits<FPT>::min() :
                                                                
                                                                
        f1/f2 ));
}

template<typename FPT>
check_fp_equal( FPT f1, FPT f2,
                FPT epsilon = std::numeric_limits<FPT>::epsilon(),
                bool strong_or_weak = true ) {
    FPT d1 = 1 - safe_fp_division( f1, f2 );
    FPT d2 = 1 - safe_fp_division( f2, f1 );

    return strong_or_weak
         ?
                          std::fabs( d1 ) && std::fabs( d2 ) :
                          std::fabs( d1 ) || std::fabs( d2 );
}

template<typename FPT>
check_fp_equal( FPT f1, FPT f2,
                int number_of_arith_oper = 1,
                bool strong_or_weak = true ) {
   return check_fp_equal<FPT>( f1, f2,
                        number_of_arith_oper * number_of_arith_oper /
2 * std::numeric_limits<FPT>::epsilon(),
                        strong_or_weak );
}

For cases 2.a, 2.b and 2.c user will use first version of
check_fp_equal, for 2.d - second.

Plus helper macroses

#define BOOST_CHECK_FLOATING_POINT_EQUAL( f1, f2, epsilon_hint ) ...
or
#define BOOST_CHECK_REAL( f1, f2, epsilon_hint ) ...

and

#define BOOST_CHECK_FLOATING_POINT_WEAK_EQUAL( f1, f2,
epsilon_hint ) ...
or
#define BOOST_CHECK_REAL_WEAK( f1, f2, epsilon_hint ) ...

What do you think about proposed solution?

Regards,

Gennadiy.


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk