Subject: [boost] [endian] Review
From: Stewart, Robert (Robert.Stewart_at_[hidden])
Date: 2011-09-13 15:18:39
Herewith is my review of Beman's Boost.Endian library.
What is your evaluation of the design?
The two-argument overloads of the [un]conditional swapping functions seem awkward at first blush, but are really the best interface for those operations. Changing to value returning functions opens the door to mistakes like the following:
uint32_t const in(123);
int32_t const out(reorder(in));
There might be a warning, but many programmers routinely ignore such.
As is, the compiler will complain about the conditional functions:
uint32_t const in(123);
native_to_big(in, out); // Error: ambiguous
The unconditional functions do not provide that benefit because they aren't function templates:
uint32_t const in(123);
An interesting variation that I've used in the past is a set of cast-style function templates that permit byte swapping one type and returning the result as a different type, after verifying that the resulting value is in range of the target type. Thus,
uint32_t const in(std::numeric_limits<uint32_t>::max());
int32_t const out(to_host<int32_t>(in)); // throws
TMP allows streamlining the range checks done, of course.
Another variation I've used is a checked, two-argument function template with each potentially having a different type:
template <class T, class U>
to_host(T & _result, U _value);
(I always prefer output parameters to be first because their position is consistent and it leaves open the possibility of defaulted parameters in the general case. I'd prefer that in Boost.Endian, but not everyone likes that convention.)
There are two possibilities for dealing with the case when T and U are different. One is that they be required to have the same, supported size and that they are both signed or both unsigned. The other is to do a runtime check that the swapped value is within the range of the target type.
I like that the aligned integer types have the longer names because they incur padding and are non-portable.
The overhead implied by the comments in the "Comment on naming" section of the Endian Integer Types page is an excellent reason for splitting the endian class template into endian_buffer and endian_integer as previously suggested. For C++11, the latter can derive from the former. For C++03, if POD-ness is important, you'll need to duplicate the former's code in the latter, but it's worthwhile.
What is your evaluation of the implementation?
There's already been a good deal of discussion on efficiency and reuse during the review, so I won't rehash it.
The 64b reorder(), for example, looks like a lot of code for an inline function. I haven't compiled it to inspect the resulting assembly to see how much or little results. My concern is for the possibility of code bloat arising from the desire to keep this a header-only library. If those were function templates, then the compiler can choose whether to inline or not, despite the definition being in a header.
The implementation of class endian is uncommented, yet non-trivial. A future maintainer, even Beman, will benefit from some helpful information inserted now.
The implementation of class endian would be simplified, it appears to me, by passing the endianness along to a detail::store() function template and letting it select the big/little endian store logic.
What is your evaluation of the documentation?
It's incomplete. There are no examples of the conversion functions. There is no tutorial. There is no guidance on selecting between the alternatives. There is no comparison with traditional htnol()-style code.
Introduction to endianness
s/modern Apple, Linux, or Windows computer/computer/ (the OS doesn't matter)
s|a[sic] Oracle/Sun Solaris computer| (the OS doesn't matter)
s/CPU's/CPUs/ (plural, not possessive)
Introduction to the Boost.Endian library
The first and last sentences should be merged: "Boost.Endian is a header only library providing facilities for managing integer endianness."
Endian conversions for native integers: The last sentence makes both absolute and relative statements. The relative statements are, presumably, relative to endian integer types, but that's not made clear. I suggest rewording as, "This approach is simple and efficient, but when compared to endian integer types, is less flexible regarding size and alignment, can be harder to manage, is more error prone because of the need to apply endian conversions in unrelated logic paths, all of which can lead to logic errors that are difficult to debug."
Endian integer types: The second sentence has a similar problem, and the last sentence could be combined to correspond with the list of the previous paragraph. Try this variation: "This approach is simple and, while it can be less efficient than endian conversions, it avoids strict alignment requirements allowing opportunities to pack data more tightly." I'd follow with this variation of the third sentence: "Furthermore, types of length from 1 to 8 bytes are supported, rather than just the usual 1, 2, 4, and 8 byte lengths."
This page is odd. The synopsis does not include links to the description of the functions. When I first saw the page, I thought there simply was no documentation for using the various functions. That documentation does appear in the latter part of the page, but it isn't obvious. There should be information explaining why conditional and unconditional functions are useful and each should refer to its counterpart. That is, reorder(uint16_t &) should note that if the intent is not to reorder the bytes regardless of the platform, that bit_to_native(uint16_t &), for example, is appropriate. The unconditional nature or reorder() escaped me until I realized that these functions always swap bytes, unlike the htonl() family, which I'd never before encountered as necessary. Thus, use cases for unconditional byte swapping are warranted.
Endian Integer Types
The Conversion Functions page links to the description of big and little endian on the Endian Home page, but the Integer Types page does not.
The example shown on the Integer Types page should be recreated using the Conversion Functions to show the difference in usage. That example should use sizeof(h) rather than sizeof(header) since that is a safer practice.
This section, while important, should not appear so soon on the page.
s/compilers do layout memory/compilers lay out memory/
s/it will be possible to specify the default constructor as trivial/the default constructor is marked as trivial/
s/will no longer disqualify/do not disqualify/
s/will no longer be relying/does not rely/
s/Big endian | little endian | native endian byte ordering./Big, little, and native endian byte ordering/
s/Signed | unsigned/Signed and unsigned/
s/Unaligned | aligned/Data may be aligned or unaligned/
s/1-8 byte (unaligned) | 2, 4, 8 byte (aligned)/2, 4, and 8 byte aligned types and 1-8 byte unaligned types/
The column order should correspond with the order of information in the typedef names: alignment, sign, endianness, size.
The discussion of the difference between aligned and unaligned types should not be "buried" in the Typedefs section.
Comment on naming
s/these type grows/these types grows/
Why does "the realization creep in" that the endian integer types are lousy arithmetic integers? Only profiling would reveal that. Consequently, you should call that out, perhaps in a "Design Details" section along with the aligned/unaligned discussion, thereby explicitly stating rather than leaving to each library user the discovery of the overhead of these types.
The link from "cover_operators" to a header is surprising.
The description of the default constructor fails to indicate whether the value is left uninitialized, which I presume to be the case.
The description of the conversion operator mixes T and value_type. While those two are the same, one has to examine the class definition to see that. Either s/T/value_type/ or s/value_type/T/.
s/and both speed and/plus both speed and/
s/disk records/records/ (applies to networking, too)
s/Yes, for sure, compared/There is overhead compared/
s/are usually be faster/are usually faster/
s/types POD's?/types POD?/
s/endian integer types not being POD's/of endian types not being POD/
s/data files are portable/data files that are portable/
The "These types are really just byte-holders" entry should not be a FAQ. That information, along with a discussion of the overhead of the simple syntax, should be part of the "Design Details" section suggested above.
Design considerations for Boost.Endian integers
Only "Design" is underlined in the section heading.
s/used very successful/used very successfully/
s/This is ensures that , and/This ensures that class endian is POD, even in C++03, and/
What is your evaluation of the potential usefulness of the library?
Highly useful, but could be more so.
Did you try to use the library?
How much effort did you put into your evaluation?
I spent about an hour reading the documentation plus a lot of time on the mailing list participating in endian discussions.
Are you knowledgeable about the problem domain?
Yes. I've used the old standby functions, htonl(), ntohl(), and friends over many years. I've created template functions to automatically select the right function based upon the size of the type being swapped, etc.
Do you think the library should be accepted as a Boost library?
There are issues that keep me from saying yes at this time. There are too many suggested variations and ideas under consideration to accept the library in its present state. However, a mini-review should be sufficient to evaluate the final form, once Beman determines a course of action, and determine whether to accept it or not.
Rob Stewart robert.stewart_at_[hidden]
Software Engineer using std::disclaimer;
Dev Tools & Components
Susquehanna International Group, LLP http://www.sig.com
IMPORTANT: The information contained in this email and/or its attachments is confidential. If you are not the intended recipient, please notify the sender immediately by reply and immediately delete this message and all its attachments. Any review, use, reproduction, disclosure or dissemination of this message or any attachment by an unintended recipient is strictly prohibited. Neither this message nor any attachment is intended as or should be construed as an offer, solicitation or recommendation to buy or sell any security or other financial instrument. Neither the sender, his or her employer nor any of their respective affiliates makes any warranties as to the completeness or accuracy of any of the information contained herein or that this message or any of its attachments is free of viruses.
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk