
From: Isaac Dupree (isaacdupree_at_[hidden])
Date: 2008-08-01 20:31:13


Is there interest in a library to check for overflow in (primarily)
the built-in integers? (And some other things related to int-wrapping:
see below.)
It does something different from constrained_value, though I suppose
they might have some code in common.

I need/want it in my own project, so I've been working on such a thing:

Use case: I'm trying to write some performance-critical code that might
buggily rely on larger ints than are provided, and it's not obvious whether
any integer overflows are actually happening (I hope not, but I need
to test -- there are some bugs that might be overflow-related).
Additionally, it has to behave exactly the same across the different
computers it's compiled on, since they talk to each other over the
network (so I've been using specific int sizes throughout, such as
int32_t).
(Okay, it's a multiplayer game where all players run the simulation on
their own computers to keep in sync, and most of the time only their
very simple commands are transferred over the network.)

For example, a bug I'd had that took me a while to squash involved not
realizing that right/left-shifting by an unsigned integer's full bit-size
is undefined behavior (the bug manifested on x86-linux-gcc but not
powerpc-linux-gcc). The two platforms also behaved differently when
dividing by zero (one crashed with SIGFPE(!), the other continued with
some value).
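
Something like this minimal test shows the difference (the results are
just whatever the hardware happens to do; nothing is guaranteed):

    #include <stdio.h>
    #include <boost/cstdint.hpp>

    int main() {
        boost::uint32_t x = 0x80000000u;
        unsigned n = 32; // shifting by the full bit-width: undefined behavior
        // x86's shift instructions take the count mod 32, so this often
        // prints 0x80000000; PowerPC typically yields 0 instead.
        printf("%#lx\n", (unsigned long)(x << n));
        // One well-defined alternative: handle the full-width case by hand.
        boost::uint32_t shifted = (n >= 32) ? 0 : (x << n);
        printf("%#lx\n", (unsigned long)shifted);
    }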

Most of the time I want overflows to be treated as errors (although
sometimes I actually want the unsigned modulo behavior, and a plausible
error behavior would be to clamp the value to within range and log to
stderr). I looked at a (proposed for Boost?) constrained-value library,
http://student.agh.edu.pl/~kawulak/constrained_value/index.html
. Since I want the full range of the built-in types, and constrained_value
deliberately doesn't check for overflow in the underlying representation,
I need something different (or additional).

Ideally it could be a drop-in replacement for the built-in types I've
been using. Like constrained_value, I decided it was too unsafe to
have implicit casting back to the representation type.

The integral-promotion behavior inherited from C would surely end up
being changed by such a wrapper. So I also want a wrapper that just
"sanitizes" those promotions: I think, e.g., adding ints of the same
signedness should produce a value with the maximum of their two bit-sizes,
and anything else (including mixed signedness) should require explicit
casting. Then I can wrap that type with my overflow-checking template
(which has performance overhead) for testing, while getting exactly the
same behavior as without the overflow checking when there are in fact
no overflows.
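
Roughly what I have in mind, as a sketch (the names and the exact
mechanics are hypothetical, not real code):

    #include <limits>

    // tiny stand-in for boost::mpl::if_c
    template<bool B, class T, class F> struct select_ { typedef T type; };
    template<class T, class F> struct select_<false, T, F> { typedef F type; };

    // The result of same-signedness arithmetic is the wider of the two
    // types; mixed signedness is left undefined, so it fails to compile.
    template<class A, class B,
             bool SameSign = (std::numeric_limits<A>::is_signed
                              == std::numeric_limits<B>::is_signed)>
    struct wider; // no definition for mixed signedness

    template<class A, class B>
    struct wider<A, B, true> {
        typedef typename select_<
            (std::numeric_limits<A>::digits >= std::numeric_limits<B>::digits),
            A, B>::type type;
    };

    template<class T>
    class sane_int { // hypothetical wrapper name
        T v_;
    public:
        explicit sane_int(T v) : v_(v) {}
        T value() const { return v_; } // no implicit conversion back
    };

    template<class A, class B>
    sane_int<typename wider<A, B>::type>
    operator+(sane_int<A> a, sane_int<B> b) {
        typedef typename wider<A, B>::type R;
        return sane_int<R>(static_cast<R>(a.value())
                           + static_cast<R>(b.value()));
    }

So sane_int<int16_t> + sane_int<int32_t> yields sane_int<int32_t>, while
sane_int<int32_t> + sane_int<uint32_t> refuses to compile.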

There's also the issue of determinism in relation to implementation-
defined behavior -- e.g. dividing or right-shifting negative numbers.
In one wrapper I probably want to (optionally, perhaps depending on
policy template arguments) prevent doing those operations on the
underlying types, and perhaps have another wrapper that detects those
cases and replaces them with built-in operations that do work, e.g.
(sint >> shift) becomes (sint >= 0 ? (sint >> shift) : ~(~sint >> shift)).
(I started out not being so wrapper-happy, but then I thought it was
probably cleaner to have a library that makes it generally easy to wrap
int-like things.)
Hopefully a compiler that doesn't need these pessimizations can just
optimize them out.
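
For instance, the cores of those replacements might look like this (a
sketch, assuming a two's-complement int32_t and shift < 32):

    #include <boost/cstdint.hpp>

    // Arithmetic right shift expressed only via well-defined shifts of
    // non-negative values: for negative v, ~v is non-negative, so shift
    // that and complement back.
    inline boost::int32_t asr(boost::int32_t v, unsigned shift) {
        return v >= 0 ? boost::int32_t(v >> shift)
                      : boost::int32_t(~(~v >> shift));
    }

    // Division rounding toward negative infinity no matter which way
    // the implementation's operator/ rounds: the identity
    // (a/b)*b + a%b == a makes the adjustment correct either way.
    inline boost::int32_t div_floor(boost::int32_t a, boost::int32_t b) {
        boost::int32_t q = a / b, r = a % b;
        return (r != 0 && ((r < 0) != (b < 0))) ? q - 1 : q;
    }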

Other notes about implementation "decisions" I've thought about:

It's very convenient to assume that the underlying values are two's-
complement bitwise (signed or unsigned) integer types with some number
of bits when writing algorithms to efficiently check for overflow, so I
do that in the part of the library that has the range-checking algorithms
for ints. However, I allow underlying types to be larger than strictly
needed (and in some cases a larger underlying type allows the
overflow-checking code to be faster as well).
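
As a concrete example of both points, here are two ways to check 32-bit
signed addition (helper names just for illustration):

    #include <limits>
    #include <boost/cstdint.hpp>

    // Exact-width check: decide whether a + b would leave the int32_t
    // range *without* performing the (undefined) overflowing addition.
    inline bool add_overflows(boost::int32_t a, boost::int32_t b) {
        const boost::int32_t mx = std::numeric_limits<boost::int32_t>::max();
        const boost::int32_t mn = std::numeric_limits<boost::int32_t>::min();
        return (b > 0 && a > mx - b) || (b < 0 && a < mn - b);
    }

    // With an underlying type wider than needed, the check is simpler
    // (and often faster): add in 64 bits, then range-test the result.
    inline bool add_overflows_wide(boost::int32_t a, boost::int32_t b) {
        boost::int64_t r = boost::int64_t(a) + b;
        return r < std::numeric_limits<boost::int32_t>::min()
            || r > std::numeric_limits<boost::int32_t>::max();
    }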

Currently I have the overflow-checking algorithms in macros, which I
don't like much. But it has some advantages (see the sketch after this
list):
- The library can even be used in preprocessor constant expressions, e.g.
      in an #if. (This is worth less than it could be, because it can't
      interact with Boost.Preprocessor constants usefully.)
- Until we have C++0x "constexpr" functions, the only way to share
      int-based algorithms between runtime code and compile-time
      template code is macros.
- The maximum bit-sizes of the values could be allowed to vary at runtime.
      But then we lose a lot of static optimizations. Passing the
      bit-sizes as dynamic arguments even to an "inline" function template
      seems risky, but making them static is forever. In light of
      constrained_value's mutable ranges, I don't think there's actually
      a need for that. Bit-shift amounts are often but not always known
      at compile time... but let's let the compiler worry about that?
But there's a disadvantage:
- No explicit sharing; the compiler will have to figure out that two
      expressions are identical. However, explicit sharing kills
      'constexpr' functions too... but at least then the shared expression
      can be put into another named function, so that it might be less
      work for the compiler to notice that they're the same.
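
To illustrate the main advantage, here's the shape of the thing (the
macro is hypothetical, not one of mine): a single definition usable in
an #if, in compile-time constants, and at runtime.

    /* True iff the unsigned value V fits in N bits. The double shift
       avoids shifting by V's full bit-width (undefined behavior) when
       N equals that width. */
    #define UINT_FITS_IN_BITS(V, N) (((V) >> ((N) - 1) >> 1) == 0)

    /* ...in the preprocessor: */
    #if !UINT_FITS_IN_BITS(0xFFFFul, 16)
    #  error "expected 0xFFFF to fit in 16 bits"
    #endif

    /* ...in a compile-time constant (fails to compile when false): */
    typedef char eight_bit_check[UINT_FITS_IN_BITS(255ul, 8) ? 1 : -1];

    /* ...and in runtime code: */
    inline bool fits_in_12_bits(unsigned long v) {
        return UINT_FITS_IN_BITS(v, 12);
    }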

Overflow checking for signed numbers is annoying/inefficient,
because e.g. negating the minimum value is an overflow. Perhaps there
should be two signed compile-time modes: one in which those operations
can overflow but not the bitwise operators ~ | ^ &, and another mode
without that minimum value, in which normal arithmetic would be
"safer" but bit operations a bit less so (but you should know what
you're doing if you're using them on signed values anyway).
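
Concretely (a hypothetical helper, assuming two's-complement int32_t):

    #include <limits>
    #include <stdexcept>
    #include <boost/cstdint.hpp>

    // Negation overflows for exactly one input: the minimum value,
    // whose negation is not representable (the range is asymmetric).
    inline boost::int32_t checked_negate(boost::int32_t v) {
        if (v == std::numeric_limits<boost::int32_t>::min())
            throw std::overflow_error("negating the minimum int32_t");
        return -v;
    }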

Conceivably a wrapper something like this could be used to wrap
"integer-valued doubles that are small enough to represent every
consecutive integer" as odd-sized integer values computed by the FPU.
But (x < 1), where x is a double, may not even give the same answer
each time, apparently (x87 excess precision is one culprit)! So I don't
have the skills to do anything like that.

Ideas of what people might want for overflow behavior (a sketch of
pluggable policies follows this list):
- boost::optional<int>, to have non-throwing failure, similar to
      floating-point NaN.
- clamping to the endpoints (overflow becomes the max value
      defined by the bit-size, underflow the min value; and if
      division by zero is one of the things detected... well, that's
      something a policy will have to say).
- modulo arithmetic for any number of bits you want, even if
      it's not a multiple of CHAR_BIT.
- obviously, exception throwing or assert failure.
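
A sketch of how such policies might plug into a wrapper (all names
hypothetical; only addition shown, computed in a wider type):

    #include <limits>
    #include <stdexcept>
    #include <boost/cstdint.hpp>

    // Each policy decides what to do with a wide result that may have
    // left T's range.
    struct throw_policy {
        template<class T>
        static T handle(boost::int64_t r) {
            if (r < std::numeric_limits<T>::min()
                || r > std::numeric_limits<T>::max())
                throw std::overflow_error("integer overflow");
            return static_cast<T>(r);
        }
    };

    struct clamp_policy {
        template<class T>
        static T handle(boost::int64_t r) {
            if (r < std::numeric_limits<T>::min())
                return std::numeric_limits<T>::min();
            if (r > std::numeric_limits<T>::max())
                return std::numeric_limits<T>::max();
            return static_cast<T>(r);
        }
    };

    template<class T, class Policy>
    class checked_int {
        T v_;
    public:
        explicit checked_int(T v) : v_(v) {}
        T value() const { return v_; }
        friend checked_int operator+(checked_int a, checked_int b) {
            boost::int64_t wide = boost::int64_t(a.v_) + b.v_;
            return checked_int(Policy::template handle<T>(wide));
        }
    };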

In the process I found/invented an int-log-base-two implementation
that seems like it should be decently fast at runtime and also
works as an integer constant expression. (Although with GCC
it can use __builtin_constant_p and the bit-sizes to detect when it
can substitute __builtin_clz[l[l]], so it can often be turned
into a single processor instruction at runtime on many
architectures, including x86 and PowerPC; do other compilers
have facilities like that?)
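
One way to get that (a sketch of the general technique, not necessarily
what I settled on): a chain of macros that binary-searches for the top
set bit, shown here for 32-bit values.

    /* floor(log2(n)) for 1 <= n < 2^32, as an integer constant
       expression: each step tests the high half of the remaining bits
       and recurses into whichever half holds the top set bit. */
    #define ILOG2_2(n)  ((n) >> 1 ? 1 : 0)
    #define ILOG2_4(n)  ((n) >> 2 ? 2 + ILOG2_2((n) >> 2) : ILOG2_2(n))
    #define ILOG2_8(n)  ((n) >> 4 ? 4 + ILOG2_4((n) >> 4) : ILOG2_4(n))
    #define ILOG2_16(n) ((n) >> 8 ? 8 + ILOG2_8((n) >> 8) : ILOG2_8(n))
    #define ILOG2_32(n) ((n) >> 16 ? 16 + ILOG2_16((n) >> 16) : ILOG2_16(n))

    #if ILOG2_32(0x10000ul) != 16
    #  error "sanity check failed"
    #endif

With GCC one could then wrap it as, roughly,
(__builtin_constant_p(n) ? ILOG2_32(n) : 31 - __builtin_clz(n)) for
unsigned 32-bit n > 0.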

Unfortunately I've been struggling a little with the int-wrapping
template classes, but I thought I'd ask whether there was interest
before struggling to bring my code up to Boost quality rather than
just "good enough for me" quality :-)

cheers,
-Isaac

