Boost logo

Boost :

From: Giovanni Piero Deretta (gpderetta_at_[hidden])
Date: 2008-02-26 10:24:02

On Tue, Feb 26, 2008 at 3:04 PM, John Maddock <john_at_[hidden]> wrote:
> John Maddock wrote:
> > 1) Floating point classification routines: these are optimised
> > implementations of the C99 and C++ TR1 functions fpclassify, isinf,
> > isnan, isnormal and isfinite. From Boost-1.35 onwards these are
> > already a part of Boost.Math (see
> >
> > so if accepted the two implementations will get merged.
> >
> > The review here should focus on the implementation used, and testing
> > on whatever platforms you have available - in particular are there any
> > circumstances (compiler optimisation settings etc) where this
> > implementation breaks?
> These look fine to me, and as Johan mentioned, any breakages should be fixed
> in endian.hpp.
> Sorry to bring this up again (!), but I've been trying to get my head around
> the aliasing rules and what is and isn't allowed, in particular by following
> the rather useful article here:

also see:

> >From this it appears that your current code is perfectly fine, however I
> still can't help feeling that we could do better than call an external
> function (memcpy) to move 4 or 8 bytes. The suggestion in the article is to
> use a union to move from one type to another:
> union float_to_int32
> {
> float f;
> boost::uint32_t i;
> };
> boost::uint32_t get_bits(float f)
> {
> float_to_int32 u;
> u.f = f;
> return u.i;
> }
> I can't claim to have tested exhaustively, but modifying get_bits and
> set_bits to use this method seems to work OK for VC8&9 and Minwg GCC.

GCC 4.2 under x86_64 produces better code with std::memcpy (which is
treated as an intrinsic)
than with the union trick (compiled with -O3):

  uint32_t get_bits(float f) {
      float_to_int32 u;
      u.f = f;
      return u.i;


        movss %xmm0, -4(%rsp)
        movl -4(%rsp), %edx
        movl %edx, %eax

Which has an useless "movl %edx, %eax". I think that using the union
confuses the optimizer.
This instead:

  uint32_t get_bits2(float f) {
    uint32_t ret;
    std::memcpy(&ret, &f, sizeof(f));
    return ret;


        movss %xmm0, -4(%rsp)
        movl -4(%rsp), %eax

Which should be optimal (IIRC you can't move from an xmms register to
an integer register without passing through memory).

Note that the (illegal) code:

  uint32_t get_bits3(float f) {
    uint32_t ret = *reinterpret_cast<uint32_t*>(&f);
    return ret;

Generates the same code as get_bits2 if compiled with
-fno-strict-aliasing. Without that flag it miscompiles (rightly) the
code. I've tested the code under plain x86 and there is no difference
between all 3 functions.

So the standard compliant code is also optimal, at least with recent GCCs.


Boost list run by bdawes at, gregod at, cpdaniel at, john at