Subject: Re: [boost] [boost::endian] Request for comments/interest
From: Dave Handley (Dave.Handley_at_[hidden])
Date: 2010-05-28 10:55:46
I wanted to throw a few things into the discussion:
1) Typed vs untyped. We have had various discussions about whether the network endian object should be a different type from the machine endian one. I have a pretty strong opinion on this: I think you should work with the same type. There are several reasons for this:
a. Performance when network endian is the same as machine endian - I'll discuss more on this later.
b. When you are reading a complex packet of data off the network, a common use case would be to reinterpret_cast a block of data into a struct. When you are using different types for network and machine endian you end up having 2 structs - one to read the data off the network, and one to copy the data into. This could turn into a pretty severe maintenance burden.
c. Given that you tend to convert to machine endian at the boundaries, you can treat the untyped interface in a very similar way to a boost::shared_ptr style interface. You call swap_in_place on the return value from the function that creates the initial struct, and therefore you never have access to the network endian version in anything except your most basic read function. As such, I think the type safety of network endian data is a bit of a red herring. Holding on to unswapped data is no different from creating a raw pointer, then half a dozen lines of code later constructing a boost::shared_ptr from it. Doing so goes against the principle of the interface and would create some evil bugs, but it hasn't been formally prohibited by the interface.
d. Finally, I would definitely want an endian interface that operates at a high level of abstraction - i.e. at the struct or container level, not at the built-in type level.
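As a rough illustration of points b to d, here is a minimal sketch of a single struct type that is filled from the wire and converted in place inside the one function that ever sees network-order bytes. Only the name swap_in_place comes from the discussion; PacketHeader, byte_swap, host_is_little_endian and read_header are hypothetical names invented for this example, and the wire format is assumed to be big endian. The memcpy stands in for a socket read directly into the struct, which avoids the alignment and aliasing questions a reinterpret_cast of a raw buffer would raise.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical packet layout; the field names are illustrative only.
struct PacketHeader {
    std::uint16_t id;
    std::uint16_t length;
    std::uint32_t sequence;
};

// Runtime probe of the host byte order (portable; optimizers usually fold it).
inline bool host_is_little_endian() {
    const std::uint16_t probe = 1;
    unsigned char first_byte;
    std::memcpy(&first_byte, &probe, 1);
    return first_byte == 1;
}

inline std::uint16_t byte_swap(std::uint16_t v) {
    return static_cast<std::uint16_t>((v << 8) | (v >> 8));
}

inline std::uint32_t byte_swap(std::uint32_t v) {
    return ((v & 0x000000FFu) << 24) | ((v & 0x0000FF00u) << 8) |
           ((v & 0x00FF0000u) >> 8) | ((v & 0xFF000000u) >> 24);
}

// Struct-level conversion from big-endian wire order to host order, in place.
// On a big-endian host it returns without touching the data.
inline void swap_in_place(PacketHeader& h) {
    if (!host_is_little_endian()) return;
    h.id = byte_swap(h.id);
    h.length = byte_swap(h.length);
    h.sequence = byte_swap(h.sequence);
}

// The only function that handles wire-order data. Callers only ever receive
// a PacketHeader that is already in host order.
inline PacketHeader read_header(const unsigned char* wire_bytes, std::size_t len) {
    assert(len >= sizeof(PacketHeader));
    PacketHeader h;
    std::memcpy(&h, wire_bytes, sizeof h);  // stand-in for reading from a socket
    swap_in_place(h);
    return h;
}
```

Because the conversion happens before read_header returns, the rest of the code base works with one struct definition and never needs a second, network-typed copy of it.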
2) To copy or not to copy. I have a very big issue with any interface that enforces a copy. Firstly, if I'm writing something to live on a memory-limited embedded device, I absolutely want to be able to endian swap in place. Secondly, we should definitely not accept an interface that copies just on the grounds that the most common operation is from a big-endian network to a little-endian machine, which always requires mutation. The platforms that Boost runs on include plenty of big-endian machine types, and I have come across numerous cases where (for performance reasons in a largely little-endian machine environment) data has been left little-endian on the network. Having a do-nothing (not even copy) case for when machine and network endianness match is very important. If you are reading a big stream of data from the network, not copying anything is key for writing high-performance production code.
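One way to get the do-nothing case is to let a partial specialization handle matching byte orders, so that case compiles down to an empty function. This is only a sketch under stated assumptions: the order enum, the converter template and byte_swap are hypothetical names made up for the example, not part of any proposed interface, and it handles 32-bit values only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical tag for a byte order.
enum class order { big, little };

inline std::uint32_t byte_swap(std::uint32_t v) {
    return ((v & 0x000000FFu) << 24) | ((v & 0x0000FF00u) << 8) |
           ((v & 0x00FF0000u) >> 8) | ((v & 0xFF000000u) >> 24);
}

// General case: source and destination orders differ, so each element is
// swapped in place. No second buffer is allocated.
template <order From, order To>
struct converter {
    static void apply(std::uint32_t* data, std::size_t count) {
        assert(data != nullptr || count == 0);
        for (std::size_t i = 0; i < count; ++i) {
            data[i] = byte_swap(data[i]);
        }
    }
};

// Matching orders: no loop, no copy, nothing to do.
template <order Same>
struct converter<Same, Same> {
    static void apply(std::uint32_t*, std::size_t) {}
};
```

With this shape, converter<order::little, order::little>::apply(buffer, n) costs nothing on a large stream, while converter<order::big, order::little>::apply(buffer, n) mutates the same buffer without copying it.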
3) A final thought is that I would much rather use template arguments to decide which endian swap to use than encode the choice in function template names. It makes endian swapping code in a big code base easier to spot, and makes the interface of the endian swapper itself more logical. When I change from big to little endian, I am tweaking a parameter of the endian swapper, not calling a wholly different function. I think from a design perspective, a template parameter makes most sense.
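To make the comparison concrete, here is a minimal sketch of a swapper whose wire byte order is a template argument rather than part of the function name. The order enum, host_order and this particular swap_in_place signature are assumptions for illustration, not the interface under review, and the sketch handles arrays of integer types only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical tag for a byte order.
enum class order { big, little };

// Runtime probe of the host byte order (portable; optimizers usually fold it).
inline order host_order() {
    const std::uint16_t probe = 1;
    unsigned char first_byte;
    std::memcpy(&first_byte, &probe, 1);
    return first_byte == 1 ? order::little : order::big;
}

// Converts `count` integers between wire order `Wire` and host order, in place.
// The wire order is a template argument, so switching from big to little
// endian is a one-token change at the call site.
template <order Wire, typename T>
void swap_in_place(T* data, std::size_t count) {
    static_assert(std::is_integral<T>::value,
                  "this sketch handles integer types only");
    assert(data != nullptr || count == 0);
    if (Wire == host_order()) return;  // orders match: nothing to do
    for (std::size_t i = 0; i < count; ++i) {
        unsigned char* bytes = reinterpret_cast<unsigned char*>(&data[i]);
        std::reverse(bytes, bytes + sizeof(T));
    }
}
```

A call then reads swap_in_place<order::big>(values, n) or swap_in_place<order::little>(values, n), so searching a code base for swap_in_place finds every conversion regardless of direction.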
Overall, I like Tom's interface, but I have been using it for quite a while (as a disclaimer).