Boost Users :

From: Zeljko Vrba (zvrba_at_[hidden])
Date: 2008-08-17 02:32:40

Next message: Ion Gaztañaga: "Re: [Boost-users] [Boost.Interprocess] int key and flat_multimap_index"
Previous message: Asif Lodhi: "Re: [Boost-users] size_type doubts / integer library.."
In reply to: dizzy: "Re: [Boost-users] size_type doubts / integer library.."
Next in thread: Michiel Helvensteijn: "Re: [Boost-users] size_type doubts / integer library.."
Reply: Michiel Helvensteijn: "Re: [Boost-users] size_type doubts / integer library.."
Reply: dizzy: "Re: [Boost-users] size_type doubts / integer library.."
Reply: Kim Barrett: "Re: [Boost-users] size_type doubts / integer library.."
Reply: Asif Lodhi: "Re: [Boost-users] size_type doubts / integer library.."

On Sat, Aug 16, 2008 at 11:50:05PM +0300, dizzy wrote:
>
> I do not agree. They generally do their job fine (which is provide portable
> support to work with native, non checked, platform integer types). For any
> other needs you should probably use another tool.
>
Well, they do *not* do their job fine: (-1 < 2U) == false, which is mathematical
nonsense (more on that below).

Signed arithmetic overflow is undefined behavior, and some CPUs actually raise
an exception on overflow (e.g. MIPS). Every 'a+b' expression, with a and b
being signed integers, is potential UB. Some machines (e.g. x86) do not raise
an exception but set an overflow flag which may be tested by a single
instruction, yet I don't know of a compiler which offers run-time overflow
checking as a code-generation option. Portable checks for overflow (relying on
bit operations) incur immense overhead (certainly much greater than a single
instruction).
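
As an illustration of what such a portable check looks like, here is a minimal sketch that uses range comparisons rather than bit operations. The names checked_add and checked_add_demo are made up for this example; the idea is to test against the limits *before* adding, so the overflowing addition is never executed.

```cpp
#include <cassert>
#include <limits>
#include <stdexcept>

// Hypothetical helper: adds two ints and throws instead of running into
// undefined behavior. The test uses only comparisons and subtractions that
// cannot themselves overflow.
int checked_add(int a, int b) {
    const int int_max = std::numeric_limits<int>::max();
    const int int_min = std::numeric_limits<int>::min();
    if (b > 0 && a > int_max - b)
        throw std::overflow_error("int addition overflows");
    if (b < 0 && a < int_min - b)
        throw std::overflow_error("int addition underflows");
    return a + b;  // safe: the result is known to be representable
}

// Usage sketch: ordinary sums pass through, an overflowing one throws.
void checked_add_demo() {
    assert(checked_add(2, 3) == 5);
    bool threw = false;
    try {
        checked_add(std::numeric_limits<int>::max(), 1);
    } catch (const std::overflow_error&) {
        threw = true;
    }
    assert(threw);
}
```

This costs two comparisons and a subtraction per addition, which is the kind of overhead referred to above.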

[mail rearranged a bit]

>
> You can research how to turn off that warning with your compiler only in
> specific parts of the code. Obviously the easiest way is explicit conversion.
>
> > Or should I just listen to the devil on my shoulder and turn off the
> > appropriate warnings?
>
> Obviously so if you are doing correct code. By your reasoning one should find
> a solution for the common if(a = b) warning other than:
> - turn off the warning
> - or wrap with another set of parentheses if((a = b)) (the equivalent of the
> explicit conversion in your case)
>

Writing an extra set of parentheses is not visually intrusive or cluttering.
Writing static_cast<int>(expr), (int)(expr), or surrounding the offending code
with #pragma _is_ cluttering and intrusive. Yes, I want my code to be short,
concise, and easily readable, in addition to being correct. So shoot me :-)

I have searched the comp.lang.c++.moderated archives and other sources on this
topic, and found two pieces of advice:

Peter van der Linden in "Expert C Programming: Deep C Secrets" writes:

"Avoid unnecessary complexity by minimizing your use of unsigned types.
Specifically, don't use an unsigned type to represent a quantity just
because it will never be negative (e.g., "age" or "national_debt")."

A quote from Bjarne Stroustrup: "The unsigned integer types are ideal for uses
that treat storage as a bit array. Using an unsigned instead of an int to gain
one more bit to represent positive integers is almost never a good idea.
Attempts to ensure that some values are positive by declaring variables
unsigned will typically be defeated by the implicit conversion rules."

Yet, all of the STL uses an unsigned type for size_type and the return value of
size(). As much as I'd like to use only signed ints, this becomes prohibitive
(due to warnings) when mixing them with STL. And yes, I've been bitten several
times in the past by implicit signed -> unsigned conversions in relational
comparisons. The sanest thing would be to throw an exception at runtime if
one compares a negative integer with some unsigned quantity, instead of getting
false for '-1 < 2U', which is mathematical nonsense. The signedness of a type
*should not* affect its numerical value in mathematical operations.
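
A minimal sketch of the throwing comparison described above. The function names checked_less and checked_less_demo are hypothetical, and only the int/unsigned pair is covered. The first assertion spells out what the built-in rule does to '-1 < 2U'.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical helper: refuses to compare a negative signed value with an
// unsigned one instead of silently converting it.
bool checked_less(int a, unsigned b) {
    if (a < 0)
        throw std::domain_error("negative value compared with unsigned");
    return static_cast<unsigned>(a) < b;  // safe: a is non-negative here
}

void checked_less_demo() {
    // Built-in behavior: -1 is converted to unsigned (a huge value) first,
    // so the comparison with 2U comes out false.
    assert(!(static_cast<unsigned>(-1) < 2U));

    assert(checked_less(1, 2U));
    bool threw = false;
    try {
        checked_less(-1, 2U);
    } catch (const std::domain_error&) {
        threw = true;
    }
    assert(threw);
}
```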

>
> > Another example: an external library defines its interfaces with signed
> > integer types, I work with unsigned types (why? to avoid even more warnings
> > when comparing task IDs with vector::size() result, as in assert(task->id <
> > tasks.size()), which are abundant in my code).
>
> You realise you can fix just that with a helpful function wrapper and you
> don't need to change whole interfaces and redesign your code because of a
> warning of your compiler being too picky with perfectly fine code right?
>
I *do* realize that. However, simple wrappers won't fix operators. I like
having as many low-overhead sanity checks in my code as possible, but I don't
want to litter the code with a bunch of assert()s in front of every arithmetic
expression. Should I replace every 'if(a < b)' with 'if(less(a, b))' and
provide a bunch of overloaded less() functions?
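
For reference, this is roughly what such an overload set would look like for just int and unsigned. It is a sketch only: the namespace name mixed is an assumption made to keep the name less() out of the way of std::less, and a real version would need a pair of overloads for every combination of integer types.

```cpp
#include <cassert>

namespace mixed {

// Same-signedness cases simply forward to the built-in operator.
inline bool less(int a, int b) { return a < b; }
inline bool less(unsigned a, unsigned b) { return a < b; }

// Mixed cases: decide on the sign first, then compare as unsigned.
inline bool less(int a, unsigned b) {
    return a < 0 || static_cast<unsigned>(a) < b;
}
inline bool less(unsigned a, int b) {
    return b >= 0 && a < static_cast<unsigned>(b);
}

}  // namespace mixed

void mixed_less_demo() {
    assert(mixed::less(-1, 2U));   // mathematically correct, unlike -1 < 2U
    assert(!mixed::less(2U, -1));
    assert(mixed::less(1U, 2));
    assert(!mixed::less(3, 2U));
}
```

Unlike a throwing comparison, these overloads return the mathematically expected answer for negative operands. The cost is exactly the clutter complained about above: every 'a < b' has to be rewritten as a function call.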

>
> The warnings were meant to be helpful. If they are not, turn them off.
>
If I'm going to turn off that particular warning, I want to compensate with
extensive run-time checking, at least in the "debug" version. There are _good_
reasons that warnings about 64->32 bit truncation or comparisons of different
signedness exist.

>
> Suppose std::vector<T>::size_type is such a checked integral. Since it's
> supposed to represent the maximum size possible in memory (actually it's just
> the size_type of the Allocator argument but I'm talking about the default one)
> then it would be something like integer<32, 0, 4294967295>. Then, your thread
> ID is something like integer<16, 0, 65535>. Then the compile time checks you
> talked about would not allow you to assign values from what vector.size()
> returns to your thread IDs. So what you actually do is get to the same old
>
Oh, the library would allow conversion from integer<32, 0, 4294967295> to
integer<16, 0, 65535>, and insert an appropriate run-time range-check. So
types integer<N1,L1,H1> and integer<N2, L2, H2> would be compatible if the
intersection of [L1,H1] and [L2,H2] is not empty; the resulting type would
be coerced to integer<max(N1,N2), min(L1, L2), max(H1, H2)> (though other
bounds are necessary for other operations, e.g., addition), with a run-time
check inserted (this makes it possible to maintain rather lax bounds at
compile-time).
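
A rough sketch of that conversion rule, assuming C++11 for static_assert. The class name ranged_integer, the trait coerced, and the typedefs are all made up for this illustration. The Bits parameter is only carried along as a label, and the value is stored in a long long to keep things short.

```cpp
#include <cassert>
#include <stdexcept>
#include <type_traits>

// Hypothetical stand-in for integer<N, L, H> with the closed range [Lo, Hi].
template <int Bits, long long Lo, long long Hi>
class ranged_integer {
public:
    explicit ranged_integer(long long v) : value_(checked(v)) {}

    // Conversion from a differently bounded type: accepted at compile time
    // when the two ranges intersect, then range-checked at run time.
    template <int Bits2, long long Lo2, long long Hi2>
    ranged_integer(const ranged_integer<Bits2, Lo2, Hi2>& other)
        : value_(checked(other.get())) {
        static_assert(Lo2 <= Hi && Lo <= Hi2, "ranges do not intersect");
    }

    long long get() const { return value_; }

private:
    static long long checked(long long v) {
        if (v < Lo || v > Hi) throw std::range_error("value out of bounds");
        return v;
    }
    long long value_;
};

// The coerced type from the text: <max(N1,N2), min(L1,L2), max(H1,H2)>.
template <typename A, typename B> struct coerced;
template <int N1, long long L1, long long H1, int N2, long long L2, long long H2>
struct coerced<ranged_integer<N1, L1, H1>, ranged_integer<N2, L2, H2> > {
    typedef ranged_integer<(N1 > N2 ? N1 : N2), (L1 < L2 ? L1 : L2),
                           (H1 > H2 ? H1 : H2)> type;
};

typedef ranged_integer<32, 0, 4294967295LL> size_value;  // like size_type
typedef ranged_integer<16, 0, 65535> thread_id;

void ranged_integer_demo() {
    static_assert(
        std::is_same<coerced<size_value, thread_id>::type, size_value>::value,
        "the coerced range is the wider of the two");
    thread_id ok = size_value(40000);  // fits, so the run-time check passes
    assert(ok.get() == 40000);
    bool threw = false;
    try {
        thread_id bad = size_value(70000);  // compiles, fails at run time
        (void)bad;
    } catch (const std::range_error&) {
        threw = true;
    }
    assert(threw);
}
```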

===

Actually, I think I could simplify my requirements a bit:

  - define template class integer<T> with T == char, short, int, long, long long
    (no unsigned types allowed!)
  - allow initialization from _any_ integer<T> or primitive type, larger or
    smaller, signed or unsigned, provided it is in the range of T; throw
    exception otherwise
  - allow conversion to _any_ underlying integer type, signed or unsigned,
    provided the value fits; otherwise throw an exception
  - allow mixed arithmetic between mixed integer<T> classes (e.g. integer<int>
    + integer<char> would be defined) as well as between integer<T> and any
    primitive type, subject to the conversion rules above
  - arithmetic would be checked for overflow
  - comparisons between integer<T> and unsigned types are allowed as long
    as integer<T> is positive or 0; otherwise an exception is thrown
  - bit manipulation operators would behave as if T were unsigned; an additional
    method or function signed_shift(integer<T>) would be provided
  - arrays of integer<T> must have the same low-level layout as arrays of T

(I think this covers all the requirements)

Example:

integer<int> a;
integer<short> b;
short result = a + b + 7;

This code would convert b to integer<int>, check for overflow before (or
after[*]) adding it to a, compute a+b, check for overflow before adding 7, add
7 to the result, check that the result fits into short, done. If any check
fails, an exception is thrown.
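
To make the sequence above concrete, here is a partial sketch of integer<T> covering only checked construction, checked conversion, and checked addition. It assumes C++11. The helper names fits, narrow, get, and integer_demo, and the choice of exception types, are assumptions of this sketch. Addition with a primitive operand is resolved by first converting the primitive to integer<T>, which is one possible reading of "subject to the conversion rules above".

```cpp
#include <cassert>
#include <limits>
#include <stdexcept>
#include <type_traits>

// True if the value v of integer type From is representable in type To.
template <typename To, typename From>
bool fits(From v) {
    typedef std::numeric_limits<To> lim;
    if (v < From(0))  // never true for unsigned From
        return std::is_signed<To>::value &&
               static_cast<long long>(v) >= static_cast<long long>(lim::min());
    return static_cast<unsigned long long>(v) <=
           static_cast<unsigned long long>(lim::max());
}

template <typename T>
class integer {
    static_assert(std::is_integral<T>::value && std::is_signed<T>::value,
                  "integer<T> is for signed integer types only");
public:
    integer() : value_(0) {}

    // Checked initialization from any primitive integer type.
    template <typename U, typename = typename std::enable_if<
                              std::is_integral<U>::value>::type>
    integer(U v) : value_(narrow(v)) {}

    // Checked initialization from any other integer<U>.
    template <typename U>
    integer(const integer<U>& other) : value_(narrow(other.get())) {}

    // Checked conversion to any primitive integer type, signed or unsigned.
    template <typename U, typename = typename std::enable_if<
                              std::is_integral<U>::value>::type>
    operator U() const {
        if (!fits<U>(value_)) throw std::range_error("value does not fit target");
        return static_cast<U>(value_);
    }

    T get() const { return value_; }

private:
    template <typename U>
    static T narrow(U v) {
        if (!fits<T>(v)) throw std::range_error("value does not fit integer<T>");
        return static_cast<T>(v);
    }
    T value_;
};

// integer<T> + integer<U>: widen to the common type, check, then add.
template <typename T, typename U>
integer<typename std::common_type<T, U>::type> operator+(integer<T> a,
                                                         integer<U> b) {
    typedef typename std::common_type<T, U>::type R;
    const R x = a.get(), y = b.get();
    const R hi = std::numeric_limits<R>::max();
    const R lo = std::numeric_limits<R>::min();
    if ((y > 0 && x > hi - y) || (y < 0 && x < lo - y))
        throw std::overflow_error("integer addition overflows");
    return integer<R>(x + y);
}

// integer<T> + primitive: convert the primitive (checked), then add.
template <typename T, typename U, typename = typename std::enable_if<
                                      std::is_integral<U>::value>::type>
integer<T> operator+(integer<T> a, U b) {
    return a + integer<T>(b);
}

void integer_demo() {
    integer<int> a(10);
    integer<short> b(20);
    short result = a + b + 7;  // each step described above is range-checked
    assert(result == 37);
}
```

Comparisons, the other arithmetic operators, and the bit-manipulation rules from the list are left out. The check here is done before the addition, using range comparisons.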

[*] Overflow checks could be implemented in platform-specific assembler;
e.g. x86 sets the overflow flag _after_ addition.
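
As a sketch of the "check after" variant without hand-written assembler: newer GCC (5 and later) and Clang provide the builtin __builtin_add_overflow, which performs the addition and reports whether it overflowed. This is a compiler extension, not standard C++, and it postdates this discussion. The wrapper names add_or_throw and add_or_throw_demo are made up for this example.

```cpp
#include <cassert>
#include <climits>
#include <stdexcept>

// Assumes a GCC- or Clang-compatible compiler that provides
// __builtin_add_overflow; other compilers will not accept this.
int add_or_throw(int a, int b) {
    int result;
    // The builtin stores the wrapped sum in result and returns true if the
    // mathematically correct sum did not fit.
    if (__builtin_add_overflow(a, b, &result))
        throw std::overflow_error("int addition overflowed");
    return result;
}

void add_or_throw_demo() {
    assert(add_or_throw(2, 3) == 5);
    bool threw = false;
    try {
        add_or_throw(INT_MAX, 1);
    } catch (const std::overflow_error&) {
        threw = true;
    }
    assert(threw);
}
```

On x86 a compiler will typically turn this into the addition followed by a test of the overflow flag, though that depends on the compiler and optimization level.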

This is the semantics that I'd like to have in debug versions; in the "release"
version of the program, everything would behave as ordinary arithmetic on
primitive types. I'm studying numeric_cast<>, but it covers only
conversions, not the other items listed above.

>
> There is a lot of possible dangerous code that can be done in C or C++ and
> some compilers tell you about it but that does not make it wrong code. Warnings
>
The problem is that every 'a+b', with a and b signed, is dangerous (potential
UB), unless you either 1) formally *prove* that the expression won't overflow
or 2) insert extra run-time checks (which clutter the code). :/

Something as common as simple addition should at least have an _option_ of
*always*, under *all* circumstances having defined behavior. The integer<>
class proposed above is just one of the possibilities; but that should have
been included in the standard :/

Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net