

From: Jeffrey C. Jacobs (darklord_at_[hidden])
Date: 2002-10-01 12:13:16


Sorry it took me a while to get back to y'all, as I did reintroduce this. :)

Having thought about it some more, I think I'm in favor of John's approach:
ignore the __int<n> types and build int_t<n>, etc. purely on the current
types available in ISO C and C++ (since ISO C supports long long and C++
'03? likely will too) [with the exception I present later]. E.g.

// No <6>
long long <5> // not a typedef; #ifdef BOOST_HAS_LONG_LONG
long <4>
int <3>
short <2>
signed char <1>

[Note: I switched the order since it makes more sense for "larger" types to
expand towards higher specializations. Also, if you make the "unsigned"
equivalents negative specializations, you can grow each in parallel without
having to renumber each time.]

As for signedness, I suppose with my sign_traits type library you could let
that handle sign and build the template in terms of that, for example:

// Implementation detail
#if defined(BOOST_HAS_LONG_LONG)
#  define BOOST_MAX_INTEGRAL_TYPES 5
#else // BOOST_HAS_LONG_LONG
#  define BOOST_MAX_INTEGRAL_TYPES 4
#endif // BOOST_HAS_LONG_LONG

  template< int Bits > // bits required
  struct uint_t
  {
      typedef typename int_least_helper
        <
          // +1 so that an exact char-sized fit selects <1>, not <0>,
          // and nothing fitting yields the nonexistent "No <6>" index.
          1 + BOOST_MAX_INTEGRAL_TYPES -
#if defined(BOOST_HAS_LONG_LONG)
          (Bits <= std::numeric_limits<unsigned long long>::digits) -
#endif // BOOST_HAS_LONG_LONG
          (Bits <= std::numeric_limits<unsigned long>::digits) -
          (Bits <= std::numeric_limits<unsigned int>::digits) -
          (Bits <= std::numeric_limits<unsigned short>::digits) -
          (Bits <= std::numeric_limits<unsigned char>::digits)
        >::least sleast;
      typedef typename sign_traits<sleast>::unsigned_type least;
      typedef typename int_fast_t<least>::fast fast;
  };

#undef BOOST_MAX_INTEGRAL_TYPES

As with the other integer classes, sign_traits should be compile-time and
incur no additional run-time overhead. It would eliminate the unsigned
specializations in the uint_t template and would allow you to simplify the
int_least_helper hierarchy to just signed (or unsigned) types [I would
probably go with the unsigned forms being the default though; reverse of
what I've written above, but that's cosmetic.]

That having been said, as long as long long [PNI] is inserted in the proper
order, I don't have a problem with an enumeration of unsigned types
following the signed ones; plus, it doesn't require the addition of a new
library (albeit one I wrote and would strongly promote :D ). Maybe every few
years we will need to add new specializations if they continue adding
intrinsic types (long long long, long long long long, megalong, hyperlong,
superlong, ultralong, etc.). [As an aside, MS IDL defines the "hyper"
keyword to be a 64-bit, signed integer, and it DOES make a good,
bit-unspecified substitute for long long were it not for its limitation to
IDL.] But I say we deal with each as they come, and as long as each new type
is always greater than or equal to the last one, the update is simple (when
we grow towards +/-infinity). Long long seems headed for standardization, so
I'm keen on seeing it added, but other than that, I suggest we keep it
simple.

I will drop a big hint that I have a specific "wickedlong" template
container for integral-like types that may make a great companion to the
int_t templates in cases when too many bits are requested, but more than
that I am not at liberty to divulge... :)

As for "long" being bigger than "long long": reading the section of the C 99
standard on limits, it seems that it's possible for, say, a 96-bit short to
be bigger than even a 64-bit-minimum long long (char is at least 8 bits).
OTOH, IMHO this is a case where common sense supersedes the standard.
Obviously it WOULD be possible for an implementer to make short the
"biggest" type, but to do so would be semantically anachronistic, if not
anarchist. Not only that, but in "The C++ Programming Language", Stroustrup
states specifically the sizes of the C++ types in terms of each other, not
just in terms of the minimums specified in the ISO C 99 section on
<limits.h>, namely, in section 4.6:

1 == sizeof(char) <= sizeof(short) <= sizeof(int) <= sizeof(long) [<=
sizeof(long long) by implication]
1 <= sizeof(bool) <= sizeof(long) [may become bound by long long]
sizeof(char) <= sizeof(wchar_t) <= sizeof(long) [may become bound by long
long]
sizeof(float) <= sizeof(double) <= sizeof(long double)
sizeof(<N>) == sizeof(signed <N>) == sizeof(unsigned <N>)

[As an aside: any thoughts on a float_t<mantissa_bits, ordinate_bits> type??
:)]

He further states the minimum bits for each type (8, 16, ?, 32, ?), where
for int and long long we can assume from ISO C 99 the values 16 and 64,
respectively, giving C 99's (8, 16, 16, 32, 64) minimums. He also states
that a char is a MINIMUM of 8 bits, which agrees with C 99's CHAR_BIT >= 8.
I think it makes more sense to set a minimum bit count for char to allow for
"wacky architectures", though who would develop a 10-bit char I've no idea.
[Note, that says nothing of the sizeof result, since sizeof(char) is always
1, but if char is 64 bits, theoretically you COULD have sizeof: (1, 1, 1, 1,
1) for each type.] Anyway, sizeof doesn't directly affect us as long as
numeric_limits is correctly specialized. And for us, "Stroustrup's rules"
work perfectly, since they are exactly the hierarchy we are assuming. So my
vote is: go by Stroustrup's hierarchy and follow his logic to assume
std::numeric_limits<unsigned long long>::digits >=
std::numeric_limits<unsigned long>::digits. (And equally that long long can
be more than 64 bits.)

I would say, though, that we can't include wchar_t or bool in the
int_least_helper, since they are unrelated to the integral hierarchy. And
the same logic, IMHO, goes for __int<N>, since they fit in an entirely
separate hierarchy from this.

Thanks BTW John for mentioning __int128. I did a few tests with MS's name
mangling rules and it turns out even though they removed the keyword (and
DEC Alpha) support in MSVC 6.0, it IS part of the name mangling. In fact,
ALL the __int<N>'s have a separate name mangling that likely predates the
addition of "bool" to ANSI C++ (though why they mangle bool as "_N" and not
"_B" has me stumped). The thing of note is, although the compiler is
treating __int8 like char and __int16 like short, the object file and linker
are treating each as separate types, a difference between (C, E, F, G, H, I,
J, K) and (_D, _E, _F, _G, _H, _I, _J, _K, _L, _M) for the ISO and
MS-specific types respectively. [No MS placeholder for long long mangling,
yet... or for signed/unsigned wchar_t (_W).] It's strange, IMHO, that the
compiler lets __int8 serve as a placeholder for [signed] char, especially
when in function and typeid mangling they are separate types (and they don't
explicitly overload the basic_string, etc. templates for __int8).

So anyway, for the Microsofties out there, it certainly is handy to know
this hierarchy:

// Grow <6>
__int128 <5> // #ifdef BOOST_HAS_INT128
__int64 <4>
__int32 <3>
__int16 <2>
__int8 <1>

[Alternatively, you could define each directly in terms of its bits since
this is known ahead of time but this makes the definition of int_t harder
AFAICT, likely using more intense MPL.]

And by extension, on the MS platform I might EXPECT int_t<16>::fast to be an
alias for __int16, NOT short. OTOH, I might expect on MS for int_t<16>::fast
to return short. IMHO, someone might want the ISO types OR the sized types,
but not typically both. Especially since they can't be mixed: we have no
guarantee that short is 16 bits, long is 32, or long long 64, even in the MS
world (MS for DEC Alpha, for instance), and thus a definite ordering of the
two merged hierarchies would be impossible.

Having thus considered the issue (and IMHO it IS worthwhile to take
advantage of MS's sized types), I would suggest one of the following:

A) #if defined(BOOST_HAS_MICROSOFT_SIZED_INTS) [and BOOST_HAS_INT128] such
that the entire integer.hpp will operate differently on MS and Non-MS
platforms, with non-MS giving ISO types and MS giving JUST the __int<N>.
The MS would use the above __int<N> hierarchy with int_t being defined in
terms of actual number of bits since std::numeric_limits won't be defined
for all the __int<N> types. Then again, we know the bits from the name so
using exact bit counts is actually easier to write and faster for the
compiler. It might look something like:

//...
      typedef typename int_least_helper
        <
          // Subtraction keeps the indexing of the hierarchy above:
          // an exact 8-bit fit yields <1> (__int8), a 128-bit fit <5>.
#if defined(BOOST_HAS_INT128)
          6 - (Bits <= 128) -
#else // BOOST_HAS_INT128
          5 -
#endif // BOOST_HAS_INT128
          (Bits <= 64) -
          (Bits <= 32) -
          (Bits <= 16) -
          (Bits <= 8)
        >::least uleast;
//...
//...

Note: integer_traits would need to be expanded to give compile-time values
for const_max and const_min for the __int<N> types for this to work.

B) Add a compile switch (#define) to indicate which encoding the user
prefers, defaulting to ISO types if the MS types are not available.
Basically something like

#if !defined(BOOST_DONT_USE_MICROSOFT_SIZED_INTEGERS) && \
    defined(BOOST_HAS_MICROSOFT_SIZED_INTEGERS)
// Code for Microsoft-specific int_t, etc.
#else // !BOOST_DONT_USE_MICROSOFT_SIZED_INTEGERS
// Current code for integer.hpp (with long long)
#endif // !BOOST_DONT_USE_MICROSOFT_SIZED_INTEGERS

The BOOST_DONT_USE_MICROSOFT_SIZED_INTEGERS could be a "positive" form if we
wanted to always default to ISO types.

C) Add a new MS-specific header file similar to integer.hpp but with the
MS-Specific definitions on different typedefs, such as replacing int_t with
__int and uint_t with u__int, etc. This file (and these types) would either
not be defined on non-MS platforms [YUCK!] or would simply be typedefs for
the existing integer.hpp library [messy but effective], which is then
similar to solution A only with two separate types: one that is ISO only,
the other preferring MS but using ISO if MS is not available.

Although B is the messiest in terms of requiring active input from the
client code (define this if you always want ISO types, otherwise on MS you
get MS types -- or the "positive" option to force the user to define
something if they WANT MS types on MS), it is the one I would vote for.
After all, the MS types ARE specifically geared for exactly what int_t is
trying to accomplish, and so I'd probably want to use int_t<n>::fast in
place of __int<N> in my MS code to make it Boost-portable, while not
expecting the switch to int_t to be in any way different from the prior code
using __int<N>, except by aliasing.

Now, as a final thought: because all "sizes" (sizeof) have to be in terms of
char (at least 8 bits), I may not expect to get a 23-bit int from
uint_t<23>::least. But suppose I was writing code on a platform where int
and short were both 3-byte, 24-bit types (possible by the standard, AFAICT)
and constantly requested uint_t<24>::least, yielding "int" on that platform.
If I then wanted to port this code over to a 32-bit machine, would it be a
problem that uint_t<24>::least still returns an "int", but that int is now
32 bits? What if my code was for a device driver that always expected 24-bit
words? How could I specify a true 24-bit type that is native on the weird
platform but simulated on the 32-bit one? Granted, I could use masking and
bitfields to force a "smaller type", and in fact that may be one nice
alternative to int_t, were it not for the fact that your variable would now
be a bitfield, e.g.:

template< int Bits >
struct uint_t
{
    //... least, fast
    struct exact
    {
        least field : Bits;
    };
};

uint_t<24>::exact x;
x.field = <some 24-bit number>;

But then you are still passing around objects of size "least", and there is
no guarantee bitfields can be extended this way or to Abstract Data Types.
Granted, ISO types will always be "fast", but if you REALLY want a 3-byte,
24-bit type, would it not be handy if you could always get one, such that it
was native on platform "Weird" and "simulated" elsewhere (say, using a
fixed-size template container similar to valarray)? These are the kinds of
things I think about when I go to bed at night! :D

Well, after all that, I hope you aren't TOO overloaded, Stephen. Definitely
a LOT of detail!

Jeffrey.

"John Maddock" <jm_at_[hidden]> wrote in message
news:01f801c2693d$c1753240$09a4193e_at_1016031671...
> > Presumably both could be available and of different sizes? Yuck!
>
> Maybe :-) but unlikely for now, although I believe that Win64 has a
> __int128 type (or will have soon), whether this is the same as long long
> on that platform I don't know...
>
> On the whole I think you should ignore __intXX types provided long long is
> available, but who knows what platforms may do in future...
>
> John Maddock
> http://ourworld.compuserve.com/homepages/john_maddock/index.htm
>
>
> _______________________________________________
> Unsubscribe & other changes:
> http://lists.boost.org/mailman/listinfo.cgi/boost
>


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk