From: Yuval Ronen (ronen_yuval_at_[hidden])
Date: 2006-06-09 10:45:15


Beman Dawes wrote:
>> what I suggested before (and obviously failed to convince): There should
>> be a set of Integer types for various sizes/alignments, which could be
>> used without any relation to endianness (which probably means native
>> endianness, just as using a simple 'int' or 'uint32_t' means native
>> endianness).
>
> What I'm missing is the motivation. Other than for endian I/O, I'm not
> able to visualize any need for integers of various sizes/alignments
> beyond those already provided by <cstdint>.

They are needed for the exact same reason you wrote class endian in the
first place.

You described your motivation as dealing with large files containing
records. These records could contain integer types, and you wanted to be
portable, and therefore store them in a declared endianness, rather than
an unknown native endianness. You also wanted to be very economical with
storage requirements, so you used weird-sized, unaligned integers.

That's perfectly fine. Now let's just take this exact example, and remove
the need for portability. If I write code that I *know* will run on a
homogeneous set of platforms, and I want to save the performance penalty
imposed by the non-native endianness, then I'd like to use native
endianness. The need for weird-sized unaligned integers to save space
didn't disappear. It's still there. Hence the need for these Integer
types for various sizes/alignments.
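
To make this concrete, here is a minimal sketch of such a type: an N-byte
unsigned integer stored in native byte order with no alignment requirement.
The name unaligned_native_uint is hypothetical and is not taken from
Boost.Integer or from Beman's code. The sketch assumes C++11 and a host that
is either little- or big-endian.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical type: N bytes of storage, native byte order, alignment of char.
template <std::size_t N>
class unaligned_native_uint {
    static_assert(N >= 1 && N <= sizeof(std::uint64_t), "1 to 8 bytes supported");
    unsigned char bytes_[N];

    // Offset of the N low-order bytes inside a std::uint64_t on this host:
    // 0 on a little-endian host, 8 - N on a big-endian host.
    static std::size_t offset() {
        const std::uint16_t probe = 1;
        unsigned char first;
        std::memcpy(&first, &probe, 1);
        return first == 1 ? 0 : sizeof(std::uint64_t) - N;
    }

public:
    unaligned_native_uint(std::uint64_t v = 0) { *this = v; }

    // Keeps only the N low-order bytes of v; no byte swapping is performed.
    unaligned_native_uint& operator=(std::uint64_t v) {
        std::memcpy(bytes_, reinterpret_cast<const unsigned char*>(&v) + offset(), N);
        return *this;
    }

    operator std::uint64_t() const {
        std::uint64_t v = 0;
        std::memcpy(reinterpret_cast<unsigned char*>(&v) + offset(), bytes_, N);
        return v;
    }
};

// A record that packs a 3-byte and a 5-byte field without padding.
struct packed_record {
    unaligned_native_uint<3> id;
    unaligned_native_uint<5> position;
};

// Usage example: values round-trip, and wider values are truncated to N bytes.
inline void unaligned_native_uint_demo() {
    unaligned_native_uint<3> x(0x123456);
    assert(static_cast<std::uint64_t>(x) == 0x123456u);
    x = 0xABCDEF12u;
    assert(static_cast<std::uint64_t>(x) == 0xCDEF12u);
}
```

On mainstream compilers sizeof(packed_record) is 8, though the standard does
not strictly guarantee the absence of padding. Nothing in the type mentions
endianness, which is the point: the space saving is available without it.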

Bottom line: I believe the need for these integer types exists
(for space efficiency, or other reasons) even without the endianness
specifications, and the latter should be built around them, and not
interleaved with them.

> In any case, such types would seem to fit better into an integer library
> than a library providing endian byte-holders.

Absolutely, that's what I was saying. These types should reside in
Boost.Integer, and Boost.Endian should just accept them (and others) as
template parameters.
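
A rough sketch of that layering, under stated assumptions: the names
byte_order and endian_holder are hypothetical, the integer type is restricted
to standard unsigned types for brevity, and bytes are assumed to be 8 bits
wide. It is not Beman's implementation.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>
#include <type_traits>

static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit bytes");

enum class byte_order { big, little };  // hypothetical name

// Hypothetical holder: the integer type is a template parameter, and the
// holder only decides the order in which that type's bytes are stored.
template <byte_order Order, typename Integer>
class endian_holder {
    static_assert(std::is_unsigned<Integer>::value, "sketch handles unsigned types only");
    unsigned char bytes_[sizeof(Integer)];

    // Storage position of the byte with significance i (0 = least significant).
    static std::size_t pos(std::size_t i) {
        return Order == byte_order::little ? i : sizeof(Integer) - 1 - i;
    }

public:
    endian_holder(Integer v = 0) { store(v); }

    void store(Integer v) {
        for (std::size_t i = 0; i < sizeof(Integer); ++i)
            bytes_[pos(i)] = static_cast<unsigned char>(
                (static_cast<std::uintmax_t>(v) >> (8 * i)) & 0xFF);
    }

    Integer load() const {
        std::uintmax_t v = 0;
        for (std::size_t i = 0; i < sizeof(Integer); ++i)
            v |= static_cast<std::uintmax_t>(bytes_[pos(i)]) << (8 * i);
        return static_cast<Integer>(v);
    }

    const unsigned char* data() const { return bytes_; }
};

// Usage example: the same value, stored in two declared byte orders.
inline void endian_holder_demo() {
    endian_holder<byte_order::big, std::uint32_t> be(0x01020304u);
    endian_holder<byte_order::little, std::uint32_t> le(0x01020304u);
    assert(be.data()[0] == 0x01);  // most significant byte first
    assert(le.data()[0] == 0x04);  // least significant byte first
    assert(be.load() == le.load());
}
```

With this split, a weird-sized integer type would only need to expose its
size and a way to get at its value for the holder to accept it.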

>> - I think that using bits numbering is better than bytes, because a)
>> uniformity with the types in <cstdint> is *very* important, IMO and b)
>> as some noted, the size of a char is not necessarily 8 bits (so help me
>> God if I understand why this is more useful than harmful), so bits
>> numbering is less ambiguous than bytes (and maybe this is the reason why
>> it was chosen to be used in <cstdint>).
>
> <cstdint> is about integers, where the number of bits is critical, even
> if not exactly a certain number of bytes.
>
> <boost/endian.hpp> is about endian byte-holders, where the number of
> bytes is critical, even if not exactly matching the architecture's
> integer number of bits.

boost/endian is not about integers? How can it not be? The *only* area
where endianness is relevant is with integers. A buffer of bytes has no,
and doesn't need any, endianness. That's why I think an *integer* type
is the parameter to the endian classes. It seems we agree on that,
because your code does exactly this - passes integer types to the endian
class.

>> Actually, it just occurred to me
>> that if portability between different platforms (with different
>> CHAR_BIT) is our main concern here, then it *must* be bits, isn't it?
>
> CHAR_BIT is fixed at 8. It never varies.

I'm certainly not a standards expert, but several posters in this thread,
and in the Boost.Asio review thread, claimed that CHAR_BIT can be
larger than 8. I had no knowledge of my own here, so I relied on them. If
this is wrong, then I am wrong as well. My apologies for that.
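
The question can be checked at compile time. The standard macro in <climits>
is spelled CHAR_BIT, and the C and C++ standards only guarantee that it is at
least 8; POSIX additionally requires exactly 8. A small sketch, assuming
C++11; the helper names storage_bits and bytes_for_bits are hypothetical.

```cpp
#include <climits>
#include <cstddef>

static_assert(CHAR_BIT >= 8, "the C and C++ standards require at least 8 bits per byte");

// Storage width of T in bits, including any padding bits.
template <typename T>
constexpr std::size_t storage_bits() {
    return sizeof(T) * CHAR_BIT;
}

// Number of bytes needed to hold a field of the given width in bits,
// rounded up to whole bytes on this platform.
constexpr std::size_t bytes_for_bits(std::size_t bits) {
    return (bits + CHAR_BIT - 1) / CHAR_BIT;
}
```

Code that depends on 8-bit bytes can say so with
static_assert(CHAR_BIT == 8, "...") instead of assuming it silently.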

> It really sounds like your concern is applications involving integers,
> and an endian class is the wrong tool to solve your problem. Is that
> possibly the case?

My concern is applications involving integers, that's correct, but I see
no contradiction between this and the endian topic. As I explained
above, I believe that concern for applications involving integers is in
fact a prerequisite for dealing with endianness.

>> - Is aligned more common than unaligned, or vice-versa? It sounds
>> logical to me, that since the POD integer types (int and friends) are
>> aligned, it should also be the 'default' behavior of any class mimicking
>> them, including of course, the endian class. The conclusion is that
>> instead of prefixing 'a' or 'aligned_' to the aligned types, the
>> unaligned types should get a prefix ('unaligned_'?).
>
> Unaligned (including the very common sub-cases of aligned by
> happenstance, careful placement, or padding) covers the vast majority of
> the uses in my experience. Forced alignment is a (somewhat dangerous)
> optimization that I would not recommend except to endian experts who
> understand the risks involved.
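
For reference, the two cases can be put side by side. This is only a sketch
of the distinction as I read it, assuming C++11; the type unaligned_u32 is a
hypothetical byte-holder, not one of the types in the proposed header.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical byte-holder: four bytes that may live at any address.
struct unaligned_u32 {
    unsigned char bytes[4];
};

struct record_with_int {
    char tag;
    std::uint32_t value;  // the compiler pads so that value is aligned
};

struct record_with_bytes {
    char tag;
    unaligned_u32 value;  // may start at any offset, aligned or not
};

// Guaranteed: a native integer member sits at a multiple of its alignment.
static_assert(offsetof(record_with_int, value) % alignof(std::uint32_t) == 0,
              "a native integer member is aligned by construction");
```

On a typical x86-64 build, sizeof(record_with_int) is 8 and
sizeof(record_with_bytes) is 5, but neither value is guaranteed by the
standard.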

Let me understand: are you saying that using an int somewhere is
"aligned by happenstance", and therefore considered "unaligned"?

>> - Having an enum with values such as 'big', 'aligned_big', 'little',
>> 'aligned_little', etc, just cries for separation. The enum should have
>> only 'big' and 'little', and the endian template can accept one more
>> template argument - 'bool aligned'.
>
> My initial implementation did have an additional template argument,
> taking an enum:
>
> enum alignment { unaligned, aligned };

Looks excellent.

> But having an additional argument meant that defaulting didn't work
> well. It is nice to be able to default the lengths for aligned.

I have to admit that I don't understand how adding the 'enum alignment'
as a first or second template argument (before or after the 'endianness'
argument) caused any problems with the default length argument. Sounds
harmless to me.
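
Here is the kind of signature I have in mind, as a sketch only. The parameter
order and all names are my assumptions, not the original implementation; it
assumes C++11. The alignment argument comes before the type, so the length
can still default to sizeof(T).

```cpp
#include <cstddef>
#include <cstdint>

enum class endianness { big, little };
enum class alignment { unaligned, aligned };

// Hypothetical signature: the length is the last parameter and defaults to
// the size of the value type, whatever alignment was requested.
template <endianness E, alignment A, typename T, std::size_t n_bytes = sizeof(T)>
struct endian {
    using value_type = T;
    static constexpr endianness byte_order = E;
    static constexpr alignment align = A;
    static constexpr std::size_t size = n_bytes;
};

// Aligned case: the length is defaulted.
using aligned_big32 = endian<endianness::big, alignment::aligned, std::int32_t>;

// Unaligned case: the length is given explicitly, here three bytes.
using big24 = endian<endianness::big, alignment::unaligned, std::int32_t, 3>;

static_assert(aligned_big32::size == sizeof(std::int32_t), "length defaulted from the type");
static_assert(big24::size == 3, "length given explicitly");
```

If the difficulty was with a different parameter order, or with defaulting
the alignment and the length at the same time, this sketch does not cover it.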

> Thanks,
>
> --Beman

My pleasure,
Yuval

