Subject: Re: [boost] [boost::endian] Request for comments/interest
From: Terry Golubiewski (tjgolubi_at_[hidden])
Date: 2010-05-28 16:27:14


----- Original Message -----
From: "Tomas Puverle" <tomas.puverle_at_[hidden]>
Newsgroups: gmane.comp.lib.boost.devel
To: <boost_at_[hidden]>
Sent: Friday, May 28, 2010 11:05 AM
Subject: Re: [boost::endian] Request for comments/interest

> Thanks Dave,
>
>> 2) To copy or not to copy.
> <snip>
>
> Dave brings up an important example which I'd like to expand on a little:
>
> Suppose your application generates a large amount of data which may need
> to be
> endian-swapped.
>
> For the sake of argument, say I've just generated an 10GB array that
> contains
> some market data, which I want to send in little-endian format to some
> external
> device.
>
> In the case of the typed interface, in order to send this data, I would
> have to
> construct a new 10GB array of little32_t and then copy the data from the
> host
> array to the destination array.

Since IP packets cannot be 10GB, I submit that you're going to have to break
your 10GB array down into messages. Then you're going to copy portions of
the 10GB array into those messages and send them. In the type-based
approach the message may indeed contain an array:

   boost::array<endian<little, uint32_t>, MaxFragmentSize> buffer;

You copy fragments of the 10GB array into that buffer before sending and,
on the receiving side, copy them out.
The user on either side of the interface can extract the data from the
fields without knowing the endianness of the field or the endianness of the
machine he's working on.
He doesn't have to know to call a swap function. He just extracts the data
using the standard copy algorithm. The conversion happens automatically by
implicit conversions.
One copy into each message. One copy out. What could be better than that?
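
A minimal sketch of that copy-in/copy-out pattern, using std::array in place
of boost::array. The little_u32 class is a hypothetical stand-in for
endian<little, uint32_t>, not the code from either library under discussion,
and MaxFragmentSize, pack and unpack are names and values invented here for
illustration.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for endian<little, uint32_t>: the value is always
// stored as four bytes, least significant byte first, whatever the host is.
class little_u32 {
public:
    little_u32() : bytes_{} {}
    little_u32(std::uint32_t v) {        // implicit: host value -> wire bytes
        for (std::size_t i = 0; i < 4; ++i)
            bytes_[i] = static_cast<unsigned char>(v >> (8 * i));
    }
    operator std::uint32_t() const {     // implicit: wire bytes -> host value
        std::uint32_t v = 0;
        for (std::size_t i = 0; i < 4; ++i)
            v |= static_cast<std::uint32_t>(bytes_[i]) << (8 * i);
        return v;
    }
    const unsigned char* data() const { return bytes_; }
private:
    unsigned char bytes_[4];
};

static_assert(sizeof(little_u32) == 4, "field must overlay exactly four bytes");

constexpr std::size_t MaxFragmentSize = 4;  // illustrative value only
using fragment = std::array<little_u32, MaxFragmentSize>;

// Sender side: one copy of host data into the message buffer.
fragment pack(const std::uint32_t* first) {
    fragment buffer;
    std::copy(first, first + MaxFragmentSize, buffer.begin());
    return buffer;
}

// Receiver side: one copy out; the conversion happens per element.
std::vector<std::uint32_t> unpack(const fragment& buffer) {
    return std::vector<std::uint32_t>(buffer.begin(), buffer.end());
}
```

Neither pack nor unpack mentions byte order; the only place that knows about
it is the field type.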

>
> This has several problems:
> 1) It is relying on the fact that the typed class can be exactly overlaid
> over
> the space required by the underlying type. This is an implementation
> detail but
> a concern nonetheless, especially if, for example, you start packing your
> members for space efficiency.

In the example I posted, on non-native machines, an object "T" is
represented inside of endian<endian_t, T> as "char storage[sizeof(T)]".
Provided that the compiler provides some kind of "packed" directive (all
that I use do), then field alignment isn't an issue.
Doesn't swap_in_place<>() make the same assumption of overlaying types?

> 2) The copy always happens, even if the data doesn't need to change, since
> it's
> already in the correct "external" format. This is useless work - not only
> does
> it use one CPU to do nothing 10 billion times, it also unnecessarily taxes
> the
> memory interfaces, potentially affecting other CPUs/threads (and more, but
> I
> hope this is enough of an illustration)

In the message-based interfaces that I am used to, one always must copy some
data structures into a message before you send it.
After all, if you're using byte-streams, then endianness doesn't really
apply.
There is always at least one copy into the message. The typed-interface
only requires one copy of data into each message.
In both techniques you have to copy the information out of the message, if
you use it, at least one time. The problem with the swapping mechanism is
that the swap requires a read and a write for every location before you
even use it, whether you actually read the fields or not. And/or the user
has to remember whether he/she has already swapped each field. Since
messages are often passed from one protocol layer to the next, usually
written by different authors, I shudder to think of the integration
experience. The typed method requires one read from each memory location no
matter what the endianness is. (Unfortunately, in the case of poorly
optimizing compilers, the read on non-native machines may actually make two
copies.) The only efficiency issue with the typed interface is that
non-native-endianness values are read out in reverse order byte-by-byte,
whereas the native-endian fields can be read out of the message more
efficiently using word-sized and aligned data transfers.
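
To make the cost comparison concrete, here is a rough sketch of the two
access patterns. Both functions are hypothetical and written only for this
comparison: reverse_all_in_place is not the swap_in_place<>() under
discussion, and read_big_u32 simply assumes a message made of consecutive
big-endian 32-bit fields.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Eager approach: reverse the bytes of every element in [first, last).
// Each element is read and written once, whether or not it is used later.
// Returns the number of elements written.
std::size_t reverse_all_in_place(std::uint32_t* first, std::uint32_t* last) {
    std::size_t writes = 0;
    for (; first != last; ++first, ++writes) {
        std::uint32_t v = *first;
        *first = (v >> 24) | ((v >> 8) & 0x0000FF00u) |
                 ((v << 8) & 0x00FF0000u) | (v << 24);
    }
    return writes;
}

// Lazy approach: decode one big-endian field from the raw message on demand.
// The buffer is never modified, so there is no "already swapped?" state to
// track between protocol layers.
std::uint32_t read_big_u32(const unsigned char* msg, std::size_t field_index) {
    const unsigned char* p = msg + 4 * field_index;
    return (std::uint32_t(p[0]) << 24) | (std::uint32_t(p[1]) << 16) |
           (std::uint32_t(p[2]) << 8)  |  std::uint32_t(p[3]);
}
```

The lazy read assembles the value byte by byte, which is the efficiency cost
mentioned above for non-native fields.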

> swap_in_place<>(r) where r is a range (or swap_in_place<>(begin,end),
> which is
> provided for convenience) will be zero cost if no work needs to be done,
> while
> having the same complexity as the above (but only!) if swapping is
> required.
> With the swap_in_place<>() approach, you only pay for what you need (to
> borrow
> from the C++ mantra)

With the typed-approach you only pay for the message fields that you read.
No extra work is required on native-endian machines.
I think the typed-approach actually fits the "only pay for what you use"
mantra better.

I get the impression that I'm missing something. If you're game, I'd like
to consider a real-world use-case that uses multiple endians and has
different protocol layers.
That is, one over-the-wire packet has several layers of headers, possibly
with a different byte order than the user payload they contain. This is
common on PCs, which often have big-endian IP headers followed by a
little-endian user payload. The whole packet is read in from a socket at
once into a data buffer owned by a unique_ptr, so the message is not copied
from layer-to-layer. I work on proprietary, non-internet networks, so I'm
not sure which protocol headers we should use for a use-case. In my
wireless applications, the headers are usually padded to an integral number
of bytes, but fields within the headers are sometimes not byte-aligned.
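
As a starting point for such a use-case, here is a small sketch. The wire
layout is invented purely for illustration (it is not a real protocol
header), and packet, payload_length, payload_value and make_example_packet
are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>

// Hypothetical wire layout, invented for illustration only:
//   bytes 0..1  big-endian payload length in bytes (the "header" layer)
//   bytes 2..   little-endian 32-bit values (the "user payload" layer)
struct packet {
    std::unique_ptr<unsigned char[]> bytes;  // sole owner of the receive buffer
    std::size_t size;
};

// Header layer: decodes its big-endian field straight from the buffer.
std::uint16_t payload_length(const packet& p) {
    return static_cast<std::uint16_t>((p.bytes[0] << 8) | p.bytes[1]);
}

// Payload layer: decodes the i-th little-endian value, again without
// copying or modifying the buffer.
std::uint32_t payload_value(const packet& p, std::size_t i) {
    const unsigned char* f = p.bytes.get() + 2 + 4 * i;
    return std::uint32_t(f[0]) | (std::uint32_t(f[1]) << 8) |
           (std::uint32_t(f[2]) << 16) | (std::uint32_t(f[3]) << 24);
}

// Builds a packet as if it had just been read from a socket.
packet make_example_packet() {
    static const unsigned char wire[] = {0x00, 0x08,               // length = 8
                                         0x78, 0x56, 0x34, 0x12,   // 0x12345678
                                         0x01, 0x00, 0x00, 0x00};  // 1
    packet p{std::unique_ptr<unsigned char[]>(new unsigned char[sizeof wire]),
             sizeof wire};
    for (std::size_t i = 0; i < sizeof wire; ++i) p.bytes[i] = wire[i];
    return p;
}
```

Passing the packet to the next layer is a std::move of the owning struct, so
the bytes themselves are never copied or rewritten along the way.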

We're only considering byte-ordering here, too. An equally important part of
the endian problem, for me, is the bit-ordering. For this I use a similar
technique for portable bitfields:

bitfield<endian_t, w1, w2, w3, w4, w5, ...>

I'm not sure yet how your swapping technique would affect that.
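
For readers unfamiliar with the bit-ordering side of the problem, the sketch
below shows the kind of extraction a portable bitfield has to perform when
fields are not byte-aligned. It is not the bitfield<> template mentioned
above; extract_bits_msb_first is a hypothetical helper, and the MSB-first
bit numbering is an assumption chosen for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Reads `width` bits (1..32) starting `bit_offset` bits into `buf`, counting
// from the most significant bit of buf[0]. The MSB-first numbering is an
// assumption made for this sketch.
std::uint32_t extract_bits_msb_first(const unsigned char* buf,
                                     std::size_t bit_offset,
                                     unsigned width) {
    std::uint32_t value = 0;
    for (unsigned i = 0; i < width; ++i) {
        std::size_t bit = bit_offset + i;
        unsigned b = (buf[bit / 8] >> (7 - bit % 8)) & 1u;  // pick one bit
        value = (value << 1) | b;                           // append it
    }
    return value;
}
```

Because the fields are assembled bit by bit from the raw bytes, a field may
straddle a byte boundary and the result does not depend on how the host
compiler lays out native bit-fields.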

If we can find the time, I think our discussions would benefit from a
concrete example to measure against.

BTW, I like the interface design of your library and the way you use macros
and iterators to ease the swappability of classes, including inheritance.
I'm arguing against swapping though because I've been using the type-based
method (but not Beman's exactly) successfully for a long time. I'm very
biased. :o)

terry

