From: Darin Adler (darin_at_[hidden])
Date: 2000-03-11 14:11:03
I should preface this by making clear the context I bring to this. In
1998 and 1999 I did many projects that involved porting software from
Windows to Mac OS. The original Windows programs often had code that assumed
little-endian byte ordering for data read from files, for example.
With classes like the ones Mark posted, I was able to fix endian assumptions
easily, often by just changing the types in a struct definition without
touching another line of code to deal with the endianness issue.
>> 1) It's possible to write an implementation that does not require a
>> LITTLE_ENDIAN_MACHINE #define to work properly by reading the data a byte at
>> a time. It might not be as efficient but it's more portable.
> I would prefer keeping it the way it is because it keeps the
> implementation as efficient as possible for a given platform. If the
> machine is little-endian, no extra work goes into using
> little_endian<int>, and vice versa.
I think this is premature optimization.
In all the contexts where I've used these classes in real life projects,
performance has never been an issue. Correctness, and getting the proper
configuration macros defined, on the other hand, were a major headache in
getting the ported code to work.
Any performance gain is only on some of the targets (the ones where the byte
ordering matches) and so you can't "count on it" anyway.
I know that some people aren't happy if a possible optimization is left
undone. For the happiness of those who feel that way, we can provide
template specializations for cases that can be done efficiently because they
match native order. But those need only kick in if the endianness of the
platform is known. If it's not known, we still have code that works; it
just takes the slower, portable path.
(There are further optimizations possible, too. On the PowerPC, for example,
there are special instructions for reading foreign-endian integers, with
easy access in the Metrowerks C++ compiler. By specializing to use these,
even little_endian<int> can be as fast as a plain integer for simple
operations on a big-endian PowerPC machine.)
The speed gains from such specializations were never measurable in the
places in real programs where I have used these classes.
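The shape of such a specialization might look like the following sketch (the configuration macro name is hypothetical, and the generic version mirrors the byte-at-a-time idea). The point is that the macro is an optional optimization, never a correctness requirement; a PowerPC specialization using the Metrowerks intrinsics would slot into the same pattern:

```cpp
#include <cstddef>

// Generic version: portable byte-at-a-time storage, always correct.
template <typename T>
class little_endian {
public:
    little_endian(T value = T()) { *this = value; }
    little_endian& operator=(T value) {
        for (std::size_t i = 0; i != sizeof(T); ++i) {
            bytes_[i] = static_cast<unsigned char>(value & 0xFF);
            value = static_cast<T>(value >> 8);
        }
        return *this;
    }
    operator T() const {
        T value = T();
        for (std::size_t i = sizeof(T); i != 0; --i)
            value = static_cast<T>((value << 8) | bytes_[i - 1]);
        return value;
    }
private:
    unsigned char bytes_[sizeof(T)];
};

#ifdef KNOWN_LITTLE_ENDIAN  // hypothetical configuration macro
// Optimization only: on a known little-endian host the stored bytes
// already have the native layout, so the value can be held directly.
// If the macro is never defined, the generic version above still works.
template <>
class little_endian<unsigned> {
public:
    little_endian(unsigned value = 0) : value_(value) {}
    little_endian& operator=(unsigned value) { value_ = value; return *this; }
    operator unsigned() const { return value_; }
private:
    unsigned value_;
};
#endif
```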
> Wouldn't this be a good place for compile-time asserts?
You can use compile-time asserts, but it's better to write code that's
guaranteed to work unless you can obtain a compelling advantage by assuming
something and asserting it.
Here's an example: In the reverse_bytes functions you presented, a template
version can be constructed that selects the proper implementation without
making assumptions about the sizes of the built-in integer types. An
underlying template function that takes the size of T as a template
parameter is partially specialized (is that the right term?) for the 1-, 2-,
and 4-byte cases. The COMPILE_TIME_ASSERTs that check the integer sizes
are then unnecessary, because the code no longer assumes those sizes.
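A sketch of that dispatch (names assumed; strictly speaking these are full specializations of a helper class template, since function templates cannot be partially specialized):

```cpp
#include <cstddef>

// Helper selected by sizeof(T); only the 1-, 2-, and 4-byte cases
// exist, so any other size fails to compile, which serves the same
// purpose as a COMPILE_TIME_ASSERT without assuming which built-in
// type has which size.
template <std::size_t Size> struct byte_reverser;

template <> struct byte_reverser<1> {
    template <typename T> static T reverse(T v) { return v; }
};

template <> struct byte_reverser<2> {
    template <typename T> static T reverse(T v) {
        return static_cast<T>(((v & 0xFF) << 8) | ((v >> 8) & 0xFF));
    }
};

template <> struct byte_reverser<4> {
    template <typename T> static T reverse(T v) {
        return static_cast<T>(((v & 0xFF) << 24) | ((v & 0xFF00) << 8)
            | ((v >> 8) & 0xFF00) | ((v >> 24) & 0xFF));
    }
};

// Public entry point: picks the right implementation from sizeof(T).
template <typename T> T reverse_bytes(T v) {
    return byte_reverser<sizeof(T)>::reverse(v);
}
```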
>> 2) I've also found it useful to have "non-aligned" numeric templates that do
>> the same thing but are made out of arrays of characters. This is useful when
>> I'm making structures to match data structures with alignment characteristics
>> different from the platform/compiler I'm using. I'll explain further if this
>> is not sufficiently clear.
> Sounds interesting. Can you do this without the use of compiler tags
> like "#pragma pack" or "__attribute__ aligned" ?
Of course! It's done without using non-portable features or even the
preprocessor. My existing implementations don't use templates, but it would
be easy to make nicer ones that are template based.
This kind of implementation simply reads a byte out of an array and does <<8
and then reads another byte. On the write side it writes into the char array
and does >>8.
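A sketch of that mechanism for a 16-bit big-endian value (names assumed). Because the only data member is a char array, the type has no alignment requirement, so it can sit at any offset in a struct without #pragma pack or __attribute__ tricks:

```cpp
// Sketch (names assumed): a 16-bit big-endian value stored in a plain
// char array. Reading shifts bytes together with <<8; writing peels
// them off with >>8, exactly as described above.
class big_endian_short {
public:
    big_endian_short(unsigned short v = 0) { *this = v; }
    big_endian_short& operator=(unsigned short v) {
        bytes_[1] = static_cast<unsigned char>(v);       // low byte
        bytes_[0] = static_cast<unsigned char>(v >> 8);  // high byte
        return *this;
    }
    operator unsigned short() const {
        return static_cast<unsigned short>((bytes_[0] << 8) | bytes_[1]);
    }
private:
    unsigned char bytes_[2];
};

// A struct of such members packs with no padding holes, so it can
// match a 4-byte on-disk record byte for byte.
struct Header {
    big_endian_short tag;
    big_endian_short length;
};
```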
>> 3) I see no reason to omit the other assignment operators like +=, -=, ++,
>> and --. In a library for myself I might not bother, but for something I'm
>> sharing I'd like that.
> My original thinking was that to do so would mislead the user into
> thinking that ++n would be faster than n = n + 1. Which is not really
> true for the case where byte order does not match the native type's.
> Disallowing these operations was my way of telling the user, "If you
> are really worried about the speed of arithmetic operations on this
> data, convert it to a native type". I think the template versions
> perform reasonably fast, but I don't want people thinking that they
> will be every bit as fast as the native types.
Perhaps my original statement, "I'd like that", was not strong enough.
In the programs where I used these classes, it was a big advantage to have
the operators defined. (I typically defined them as the need for each one
cropped up.) If the appropriate operator is defined, then there is no need
to change any code other than the place where the integer is declared
(usually in a struct definition). If the operator is not defined, the code
that manipulates the integer must be changed, which is more error prone.
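The operators are cheap to write, too: each one just round-trips through the native type, so it is correct for any byte order even though it is no faster than n = n + 1. A sketch (names assumed):

```cpp
#include <cstddef>

// Sketch (names assumed): the endian-fixed class plus compound
// operators. Each operator converts to the native type, operates,
// and stores back, so code like "header.count += 1" or
// "++header.count" compiles unchanged after the field's type changes.
template <typename T>
class little_endian {
public:
    little_endian(T value = T()) { *this = value; }
    little_endian& operator=(T value) {
        for (std::size_t i = 0; i != sizeof(T); ++i) {
            bytes_[i] = static_cast<unsigned char>(value & 0xFF);
            value = static_cast<T>(value >> 8);
        }
        return *this;
    }
    operator T() const {
        T value = T();
        for (std::size_t i = sizeof(T); i != 0; --i)
            value = static_cast<T>((value << 8) | bytes_[i - 1]);
        return value;
    }

    // Compound operators: convert, operate natively, convert back.
    little_endian& operator+=(T v) { return *this = static_cast<T>(*this) + v; }
    little_endian& operator-=(T v) { return *this = static_cast<T>(*this) - v; }
    little_endian& operator++() { return *this += 1; }
    little_endian& operator--() { return *this -= 1; }

private:
    unsigned char bytes_[sizeof(T)];
};
```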
I got big wins in doing endian fixes in programs in this way. I'm convinced
that the ports were done faster than they would have been otherwise; I was
able to resist the urge to redesign and recode big parts of the original
programs.
Re: Efficiency, I'd like to point out that these operations on these objects
are still fairly quick and at most O(size of the integer). As I have said
repeatedly, the efficiency was never an issue in practice.
> The other reason I didn't add the operators is I think its cooler than
> cheez-whiz that a class can do so much with just a constructor and a
> single operator. :)
I agree it is cute, but it's not a compelling reason. I'd rather keep the
client code simple and have this library code be a bit more complete and
complex.
I think I'm going to make a cut at these endian-specific classes given my
different understanding of the requirements. I'll post it to the list when
it's ready.
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk