Boost logo

Boost :

Subject: Re: [boost] [endian] Review
From: Beman Dawes (bdawes_at_[hidden])
Date: 2011-09-14 13:32:32


On Tue, Sep 13, 2011 at 3:18 PM, Stewart, Robert <Robert.Stewart_at_[hidden]> wrote:
> Herewith is my review of Beman's Boost.Endian library.
>
> ____________________
> ____________________
> What is your evaluation of the design?
>
> _____
> The two-argument overloads of the [un]conditional swapping functions seem awkward at first
> blush, but are really the best interface for those operations.  Changing to value returning
> functions opens the door to mistakes like the following:
>
>   uint32_t const in(123);
>   int32_t const out(reorder(in));

Wow, you have a devious mind:-) I had to read that three times before
I saw the problem!

> There might be a warning, but many programmers routinely ignore such.

Documentation can help a bit. But there are enough different concerns
about the conversion interfaces I'm not sure we can address them all.

> As is, the compiler will complain about the conditional functions:
>
>   uint32_t const in(123);
>   int32_t out;
>   native_to_big(in, out); // Error: ambiguous
>
> The unconditional functions do not provide that benefit because they aren't function templates:
>
>   uint32_t const in(123);
>   int32_t out;
>   reorder(in, out);
>
> _____
> An interesting variation that I've used in the past is a set of cast-style function templates that permit byte swapping one type and returning the result as a different type, after verifying that the resulting value is in range of the target type.  Thus,
>
>   uint32_t const in(std::numeric_limits<uint32_t>::max());
>   int32_t const out(to_host<int32_t>(in)); // throws
>
> TMP allows streamlining the range checks done, of course.
>
> _____
> Another variation I've used is a checked, two-argument function template with each potentially having a different type:
>
>   template <class T, class U>
>   void
>   to_host(T & _result, U _value);
>
> (I always prefer output parameters to be first because their position is consistent and it leaves open the possibility of defaulted parameters in the general case.  I'd prefer that in Boost.Endian, but not everyone likes that convention.)
>
> There are two possibilities for dealing with the case when T and U are different.  One is that they be required to have the same, supported size and that they are both signed or both unsigned.  The other is to do a runtime check that the swapped value is within the range of the target type.

It isn't clear to me that the conversion library is the right place to
do range checks. A major point of using endian conversion functions
rather than endian integers is a need to perform arithmetic and
related operations on native types. Doesn't the same apply to range
checks?

An argument to do range checking on the implicit conversions in the
endian integer types might be stronger, but even there that feels more
like something that should go in a separate checked integer library.

> _____
> I like that the aligned integer types have the longer names because they incur padding and are non-portable.
>
> _____
> The overhead implied by the comments in the "Comment on naming" section of the Endian Integer Types page is an excellent reason for splitting the endian class template into endian_buffer and endian_integer as previously suggested.  For C++11, the latter can derive from the former.  For C++03, if POD-ness is important, you'll need to duplicate the former's code in the latter, but it's worthwhile.

I'm not sure C++03 POD-ness is critical. AFAIK, the only impact would
be use in unions. So maybe in C++03, endian_buffers but not
endian_integers would be usable in POD's.

One critical issue is improving the documentation, including fully
functional (and tested) examples that demonstrate programming the
common use cases using (1) conversions only, (2) endian_buffers, and
(3) endian_integers.

> ____________________
> ____________________
> What is your evaluation of the implementation?
>
> There's already been a good deal of discussion on efficiency and reuse during the review, so I won't rehash it.
>
> The 64b reorder(), for example, looks like a lot of code for an inline function.  I haven't compiled it to inspect the resulting assembly to see how much or little results.  My concern is for the possibility of code bloat arising from the desire to keep this a header-only library.  If those were function templates, then the compiler can choose whether to inline or not, despite the definition being in a header.

I'll give it some thought. What I'd really like to develop is a decent
recipe for being able to support either header-only or
compiled-library approaches.

> The implementation of class endian is uncommented, yet non-trivial.  A future maintainer, even Beman, will benefit from some helpful information inserted now.

Point taken.

> The implementation of class endian would be simplified, it appears to me, by passing the endianness along to a detail::store() function template and letting it select the big/little endian store logic.

The same might applied to a load() function template?

There have also been requests to build the implementation on top of
the conversion functions. That implies that the conversion functions
are expanded to cover unaligned cases, etc.

For me, the key thing that needs to come out of this review is a
complete public interfaced. The details of the implementation will get
looked at, of course, but the public interface needs to be cast in
stone.

>
> ____________________
> ____________________
> What is your evaluation of the documentation?
>
> It's incomplete.  There are no examples of the conversion functions.  There is no tutorial.  There is no guidance on selecting between the alternatives.  There is no comparison with traditional htnol()-style code.

Agree. I was planning to do that before the review, but between
evacuating before Irene, storm cleanup afterward, and a week of
repeated power failures it just didn't happen.

>...

I'm skipping the details of spelling and grammatical suggestions. They
are appreciated and will get applied when the docs are updated.

>Thus, use cases for unconditional byte swapping are warranted.

I've used unconditional swapping in a control path that only occurs
when a swap is needed. For example, when doing file conversions the
format of the current system isn't the determining factor. Rather,
whether or not the input file ordering is the same as the ordering
requested for the output file determines if a reorder is needed, and
that isn't known until run time. The conditional reorder functions are
specified in terms of the unconditional functions, too.

> What is your evaluation of the potential usefulness of the library?
>
> Highly useful, but could be more so.
>
> ____________________
> ____________________
> Did you try to use the library?
>
> No.  Sorry.
>
> ____________________
> ____________________
> How much effort did you put into your evaluation?
>
> I spent about an hour reading the documentation plus a lot of time on the mailing list participating in endian discussions.
>
> ____________________
> ____________________
> Are you knowledgeable about the problem domain?
>
> Yes.  I've used the old standby functions, htonl(), ntohl(), and friends over many years.  I've created template functions to automatically select the right function based upon the size of the type being swapped, etc.
>
> ____________________
> ____________________
> Do you think the library should be accepted as a Boost library?
>
> There are issues that keep me from saying yes at this time.  There are too many suggested variations and ideas under consideration to accept the library in its present state.  However, a mini-review should be sufficient to evaluate the final form, once Beman determines a course of action, and determine whether to accept it or not.

Understood. I'm already committed to a min-review.

Thanks for all the detailed comments. Much appreciated!

--Beman


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk