
Subject: Re: [boost] [fixed_point] First presentation from GSoC 2015 and more
From: John McFarlane (john_at_[hidden])
Date: 2016-10-14 13:11:56
>
> Date: Thu, 13 Oct 2016 20:41:27 +0000 (UTC)
> From: Christopher Kormanyos <e_float_at_[hidden]>
>
> > I discuss P0106R0 (a revision to N3352) in my paper, the latest draft of
> > which is always available here:
> > https://github.com/johnmcfarlane/fixed_point/blob/master/doc/p0037.md#p0106
> > I'm honestly not sure which approach is preferable overall. The set of
> > features and qualities requested during SG14 and SG6 meetings is large and
> > conflicting. It includes minimal build times, minimal runtime overhead,
> > every rounding and overflow strategy listed in P0105, decimal radix and
> > support for specialized fixed-point instructions on embedded architectures.
> Hi John. Thanks for visiting this thread.
>
> My recommendation would be to limit the scope of the proposal
> to a realistic level in the SGs so that we can end up with anything
> at all within a reasonable time scale. We could consider back-porting
> some (but probably not all) of SG14's specified interface to Boost
> if that helps move anything along.
A second pair of implementor eyes would be a huge help.
I should point out that although it hasn't seen much attention lately,
P0106 may still be active. (Frankly, the evolution of my design has been
largely a process of slowly copying or facilitating all the most popular
features from Lawrence's proposal!)
One problem P0037 faces, even if it were to be standardized, is that on
its own it only provides fixed-point arithmetic. It's designed to work
with types that also solve arbitrary width, rounding and overflow, but those
types are not part of any proposal. In particular, without an
arbitrary-width integer type, it's very difficult to use fixed_point<>
safely.
>
> I was also wondering about decimal radix, but I don't know if it
> might simply be better to identify those 80% of the most popular
> use cases and simply specify these. At least it would get the library
> moving forward and people could use fixed-point in C++. This is
> basically what we tried to do with the GSoC project.
>
That makes sense. I at least want to have a solution which doesn't close
the door on decimal in future additions. Here's my current plan:
https://github.com/johnmcfarlane/fixed_point/blob/master/doc/p0037.md#nonbinaryradixes
>
> >> In your work I've been experiencing some trouble with
> >> the division routine.
> > Can you elaborate? Was this recently? I've been trying out numerous
> > strategies for arithmetic operators. My current preferred approach is to
> > pad the left-hand operand before performing the division, but that's only
> > been checked in for a few months.
>
> I'm not sure about the design, and an error on my side is
> a likely possibility. But the division operator in the master
> branch does not seem to pad before dividing.
> This might give the wrong result.
>
I hope that's not the case, but it's a tough operator to get right,
especially with 32-bit operands. If you ever find a good example of this
happening, I'd be very interested in it. At least when using the elastic
alias, precision loss should be avoidable (beyond the inevitable approximation
of rational numbers and x/0).
tl;dr: The details of padding in the division operator follow...
The padding is done on this line:
https://github.com/johnmcfarlane/fixed_point/blob/91c3f2e272f9fd5e33ffb4dc2db141b41884186d/include/sg14/bits/fixed_point_named.h#L231
It's on display in this test:
https://github.com/johnmcfarlane/fixed_point/blob/1cc7ceaaff5422667ae962995aec369ae3f85dd2/src/test/fixed_point_common.h#L735
In the above test, the two operands each have a single fractional digit. If
operator/ did not pad the LHS, then the integer operation would be 126/8,
which would lose precision. Instead, the LHS is promoted and shifted left
by 8 bits. Then, integer division occurs and the value is stored with
(hopefully!) no further shifts.
> For division, I have used the following approach:
> * handle sign
> * extend n bits to 2n bits
> * shift left by radix split plus one round bit
> * divide 2n / n -> n bits
> * handle round and sign
> So you're right, this requires 64-bit extension for a 32-bit fixed-point,
> unless you tackle the problem of two-component
> division, which I have done using a simplified
> version of Knuth long division. At the moment,
> this is only in the develop branch. The compiler switch
> is BOOST_FIXED_POINT_DISABLE_WIDE_INTEGER_MATH.
>
This is core to the difference in approach between P0037 and P0106. I
guarantee to use built-in integer math for any type specialized on built-in
integers. I believe that types with arbitrary width can then be built upon
this foundation using novel integer types.
Here is an example using Boost.Multiprecision to express a googol:
http://johnmcfarlane.github.io/fixed_point/#boost
fixed_point<> uses the same algorithm regardless of whether the `Rep`
integer is a uint8_t or a 400-bit custom type.
>
> >> I also had some problems compiling on VC12 (but you don't target that
> >> compiler anyway). So that's no big deal.
>
> > You will need a C++11-compliant compiler as sg14::fixed_point<> creates
> > literal types. Lack of support for features like constexpr constructors is
> > the main obstacle to compiling under VC++ 2013 and earlier.
> I actually agree with that approach. Either it uses C++11 or
> it does not. I don't see the need to cater to a specific older
> compiler for new library work.
>
It's a pragmatic choice. Three reasons are:
* C++11 has everything I need to implement (with difficulty) a literal
type. In particular, I can write most of my unit tests using
`static_assert` which is the best thing ever!
* Once you start returning novel specializations from your arithmetic
operators, `auto` type deduction is essential to usability.
* Many of my target users in SG14 are stuck with compilers which are behind
the curve.
> >> In your work you are using wrapped versions of elementary transcendental
> >> functions designed for built-in float. You might appreciate our
> >> hand-crafted functions designed to be highly efficient in several digit ranges.
>
> > Yes! I would be very interested in seeing if I could adapt those
> > functions.
> We worked hard on these. And there are many good
> algorithms available in
> boost/fixed_point/fixed_point_negatable_cmath.hpp
>
Having spent a nontrivial amount of time writing the simplest sqrt I
could find, I appreciate the effort this must have involved.
> > Mine are placeholders and I've deliberately postponed the nontrivial task
> > of adding cmath equivalents. It's come up, maybe a couple of times,
> > whether these functions are desired, and the response has been to just
> > keep the proposal narrow, so they are not included.
>
> > There are interesting questions regarding how to implement some functions.
> > For example, you surely don't always want the same inputs and outputs from
> > functions involving angles. And hopefully these functions can be constexpr,
> > unlike existing functions.
> I agree.
> > Also, how have you approached the tradeoff between speed and accuracy? I
> > went for a very simple sqrt function which is adequate if the value is
> > calculated at compile time but is likely not desirable at runtime. This
> > isn't a problem that floating-point variants face.
>
> The elementary transcendental functions use many techniques,
> including polynomial approximation, Padé approximation, Taylor series, and
> Newton iteration. For low digit counts, mostly polynomial approximation. We
> express the coefficients directly as integral values of the fixed_point
> representation and avoid as much as possible full division and multiplication,
> favoring integer div and mul. These functions are quite efficient. We have
> not made them constexpr. But it looks like there is potential for that.
>
Writing constexpr functions in C++11 can be a big challenge.
>
> >> I have also benched our fixed_point on bare-metal embedded systems, both
> >> 8-bit and 32-bit, with very satisfying efficiency results. This is a key
> >> feature for our embedded users that might also be interesting for SG14.
>
> > My current aim is to match integer performance for common arithmetic
> > operations when using builtins. This leads to some situations where
> > overflow is a big problem, but no worse than dealing with raw integers.
> > I'm curious how you deal with multiplying two int-sized values together
> > without resorting to wider, slower types. I was under the impression that
> > P0106 types always widen to fit their values.
> We have resorted to wider integers for internal mul and div, or used
> higher-level algorithms when the compiler switch
> BOOST_FIXED_POINT_DISABLE_WIDE_INTEGER_MATH
> is activated, as mentioned above (develop branch only).
> In our approach, full multiplication requires:
> * sign conversion
> * left shift for round bit
> * multiply n*n -> 2n bits
> * right shift and round
> * sign conversion
> On a 32-bit system, a full 32*32 -> 64 mul is required.
> This full multiplication approach cannot achieve your desired efficiency
> goal on 8-bit and 32-bit CPUs. Division uses a similar approach.
>
> Multiplication and division with plain int or unsigned can achieve the
> efficiency goal, if it were not for handling rounding and overflow.
>
In my current branch, I'm actually trying to get even closer to
integer-level efficiency. No longer does 32-bit multiplication result in
64-bit results on systems with 32-bit integers. This means that overflow on
32-bit arithmetic operations is even more common. The solution is explicit
casting to 64-bit or use of an integer type which auto-widens:
http://johnmcfarlane.github.io/fixed_point/#elastic
This is the toughest design issue I face, especially when combined with
division.
>
> >> Thank you and best regards, Chris
> > And thank you. Keep up the good work!
> > John
> Thanks again and best regards, Chris
>
Cheers,
John