 # Boost :

Subject: Re: [boost] [fixed_point] Request for interest in a binary fixed point library
From: Vicente J. Botet Escriba (vicente.botet_at_[hidden])
Date: 2012-04-15 16:07:44

Le 11/04/12 23:19, Vicente J. Botet Escriba a écrit :
> Le 11/04/12 20:23, Phil Endecott a écrit :
>> Hi Vicente,
>>
>> Vicente J. Botet Escriba wrote:
>>> the recent discussion on the MultiplePrecission Arithmetic library
>>> has show that some people has its ow fixed point library.
>>
>> You might like to review old discussions of fixed point, e.g.:
>>

>>
> Thanks for pointing to these threads. I will read them carefully.
Hi,

I have take a look at these threads and in particular to your fixed.h
file. Next follows some reactions

// Meaning of ++ and --
// --------------------
//
// It could be argued that the meaning of ++ and -- is unclear: do they add/subtract
// 1, or do they add/subtract the type's delta? (...there is an example of something
// where they do delta...). I note, however, that the built-in floating point types
// add and subtract 1, as does the proposed "Embedded C" fixed-point type.
// Therefore, ++ and -- are defined to do that for fixed and ufixed.

N3352 don't include them at all. From one side, I consider you are right and we should follow the builtin float semantics.
> From the other side, there is a precedent with chrono::duration that defines them increasing the representation.
While fixed_point is closer to the chrono design the goal of fixed_points is not to provide units but just another way to work with 'reals" so
increasing the delta is not the good solution.

++x;

and x+=1;

should be equivalent when both are provided.
Note that chrono doesn't allow the preceding operation. The user needs to specify what is been added, seconds, milliseconds, ...

I believe the best is to define the operator++ as if we did +=1.

// Fixed Point Constants
// ---------------------
//
// It would be ideal if code such as
//
// const ufixed<2,14> pi = 3.1415926;
//
// could have no run-time overhead. To be investigated...

The library could provide a constructor from ratio (conversion)

constexpr ufp<2,14> pi(ratio<31415926,10000000>);

While this is more verbose, it seams acceptable and there should not be run-time overhead.

// Overflow and Saturation
// -----------------------
//
// Users of fixed-point arithmetic sometimes want to detect overflow, or for
// operations to saturate when overflow would occur. However, users of
// floating-point and integer arithmetic also sometimes want these features, but they
// are not provided by the built-in C/C++ types. The approach taken in this
// implementation is to follow the behavior of the built-in types, i.e. to not
// handle overflow; values will wrap-around.

This is inline with the design of Chrono.
I will see how I can change my prototype so that overflow is managed by the underlying type. Does this mean that the library should provide
an integer class with an overflow template parameter? Been there, why not define a class with a range and overflow parameter.
The cardinal and integral classes defined in n3352 correspond quite closely to the needs.

// Division By Zero
// ----------------
//
// compare with int and float...
//
// embedded C fixed is undefined

What about just asserting when dividing by 0?

// Benchmarks
// ----------
//
// The following results are based on the benchmark programs in the ... directory.

Just a comment on your implementation. It seems that the implementation
expect the integral and fractional parts to be unsigned (see the use of
static_unsigned_max) even if the template parameters are signed (int).
This is way I don't like the i-f format, as it is not evident to say the
the number of integral or fractional bits is -3. I guess you don't use
it never this way.

template<int XWB, int XFB, typename X_VAL_T, int YWB, int YFB, typename Y_VAL_T>
pbe::fixed<boost::static_unsigned_max<XWB,YWB>::value, boost::static_unsigned_max<XFB,YFB>::value>
operator+(pbe::fixed<XWB,XFB,X_VAL_T> x, pbe::fixed<YWB,YFB,Y_VAL_T> y) {
typedef pbe::fixed<boost::static_unsigned_max<XWB,YWB>::value, boost::static_unsigned_max<XFB,YFB>::value> result_t;

>>
>> - There are difficult decisions to make about the result types of
>> arithmetic operations, analogous to this:
>>
>> int32_t a, b;
>> int64_t c;
>> c = a+b; // oops.
>>
>> It is tempting to make the return type of e.g. operator+ large enough
>> to accommodate the largest possible sum, and to truncate only when
>> the assignment occurs. But I think this will have an unavoidable
> Are you referring to the fact that a and b must be promoted before? or
> that the operation is done on the promoted type? Or maybe both?
>
> The promotion is not really needed always. For example
>
> fp<int64_t,32,0> a,b;
> fp<int64_t,33,0> c;
> c=a+b;
>
> No promotion needed in this case, but the operation is done on int64_t.
>
>> Some will argue that expression templates can be used to work around
>> this, but that dramatically increases the complexity of the library.
>>
> An alternative is to define functions that have the result type as
> template parameter, so if what you want is
> fp<int32_t,32,0> a,b;
> fp<int32_t,32,0> c;
> c=a+b;
>
>
> c = plus<fp<int32_t,32,0>>(a,b);
>
> here plus could be optimized and don't promote to fp<int64_t,33,0>. I
> have not implemented this yet, but I expect this to be as efficient as
> if we did
>
> int32_t a, b;
> int32_t c;
> c = a+b;
>
> in the case overflow is ignored. Of course, and implementation should
> be needed to probe the expectations.

After reading the post of M. Vitale here
related to the type of the arithmetic operations I think that there are
two different contexts:

C1- Arbitrary large fixed_points types (as used in N3352)

In this case the result is large enough to don't loss information so the
type can be deduced. Any overflow and rounding policy are good as no
applied. The liability is a loss of performance on the arithmetic operation.

Note that here the use of expression templates could improve the
performances if the result type needs several chunks.

C2- Bounded fixed-point types.

We can not in a general case avoid overflow checking during the
operation itself, but the rounding can be any.

I see two options:
O1 - expression templates that will apply the arithmetic operation with
the knowledge of the result type.

'implementation defined expression template' operator+(A, B);

E.g. if int64_t is available

ufp<63,0> a, b;
ufp<64,0> c,d;

c=a+b; // OK and possibly direct
c=d+b; // OK, returns ET and apply overflow policy of c
c=a+d; // OK, returns ET and apply overflow policy of c

While this works at first level things start to be more complex when
there are two operators

ufp<64,0> c,d,e,f;
c=d+e+f;

Should partial overflows be taken in account?

O2- don't define the operation in this case (compile fail) and let the
user specify the result type using a different function.

'fixed_point class deduced from A and B operator+(A, B); // when the
result is representable without overflow.

template <typename RT, typename A, typename B>

E.g. if int64_t is available

ufp<63,0> a, b;
ufp<64,0> c;

c=a+b; // OK
c=c+b; // compile fail
c=a+c; // compile fail
c=c+b; // compile fail
c=add<ufp<64,0,OV>>(a,c); // Ok, applying the OV policy.

My particular preference is for O2.

I don't see how a single class could take in account the context C1 and
C2 without adding a new template parameter, as in some domains the
fixed_point class must be bounded, while in others it is more important
the precision than the efficiency.

I would say that we need two family classes
* fixed but arbitrary wide (N3352).
* fixed and bounded by the widest integer builtin type or even by the
user (something like my prototype).
>
>> - There are also difficult decisions to make about implicit and
>> explicit conversions.
> Are you referring to the conversion between fixed-point types with
> different range or resolution? Yes, this is a conflicting point and
> has a deep impact on the design of the library. I don't know if we can
> manage by providing both.
>
> My prototype as well as the C++ proposal allows implicit conversions
> when there is no loss of data and requires some kind of fixed point
> cast if information could be loss.
>
> {
> unsigned_number<2,-3> n1((index(1)));
> unsigned_number<2,-2,round::negative> n2;
> n2=number_cast<unsigned_number<2,-2,round::negative> >(n1);
> }
>
> An explicit convert function could also be used to avoid repeating the
> target type
>
> convert_to(n1, n2);
>
Do you have any comment on this implicit/explicit conversion approach?

// fixed<15,16> f1, f2; // Signed fixed-point values with 15 bits of integer part
// // and 16 bits of fractional part.
// f1 = 1.2345; // Conversion from floating point.
// f2 = f1 - 2; // Mixed arithmetic with integers.
// f2 = f1 / f2; // Arithmetic on fixed-point values.

This should need a explicit conversion with my approach, as we reduce the range and/or the resolution.

f1 = fixed<15,16>(1.2345);
f2 = fixed<15,16>(f1 - 2);
f2 = fixed<15,16>(f1 / f2);

or

f2 = substract<fixed<15,16> >(f1,2);
f2 = divide<fixed<15,16> >(f1,f2);

It is clear from these examples that safety has a cost.

Would this syntax be convenient? Or, do we need something intermediary as
f1 = convert(1.2345);
f2 = convert(f1 - 2);
f2 = convert(f1 / f2);

The template function convert will return a wrapper of its parameter
that is accepted as an implicit conversion.

template <typename T>
struct convert_tag;
template <typename T>
convert_tag<T> convert(T const&);

fixed(convert_tag<float>); // implicit
template <int I,int F>
fixed(convert_tag<fixed<I,F>>); // implicit

My prototype contains already an index function and a index_tag<T>
template that is used to construct a fixed point from a valid and
unchecked representation.

Best,
Vicente