Subject: Re: [boost] [boost::endian] Request for comments/interest
From: Stewart, Robert (Robert.Stewart_at_[hidden])
Date: 2010-06-03 07:08:45


Terry Golubiewski wrote:

I have some problems with your tests.

> ===== CONVERT IN PLACE ========
>
> The benchmark program generates a homogeneous data file with
> 2^20 4-byte unsigned, big-endian integers.
> The array is read into memory as one big blob, and then is
> converted in place to machine-endian (little).
> The result is memcmp'ed to verify the expected result.
> The reading-in, converting, and verifying was repeated 1000 times.
>
> Swap Based: 9 seconds
> Type Based: 13 seconds
>
> When the disk-data-file was in little endian, both approaches
> came in at around 6 seconds.
>
> --- swap-based ---
> for (int trial=0; trial != 1000; ++trial) {
> {
> ifstream input("array.dat", ios::binary);
> input.read(reinterpret_cast<char*>(&array2), sizeof(array2));

Reading the data into memory shouldn't be part of the timed code. Read the file once, outside the loop. If you then apply the swap-in-place logic an odd number of times, the result will still be in host order.

> swap_in_place<big_to_machine>(array2.begin(), array2.end());
> }
> assert(memcmp(&array1, &array2, sizeof(array_type)) == 0);
> }
>
> --- type based ---
>
> for (int trial=0; trial != 1000; ++trial) {
> {
> ifstream input("array.dat", ios::binary);
> input.read(reinterpret_cast<char*>(&array2), sizeof(array2));

As above.

> disk_array& src = reinterpret_cast<disk_array&>(array2);
> interface::copy(src.begin(), src.end(), array2.begin());

Two lines instead of one. I find this less desirable. I imagine you could make your copy function more helpful.

> }
> assert(memcmp(&array1, &array2, sizeof(array_type)) == 0);
> }
>
> ======== CONVERT & COPY =========
>
> The benchmark program still generates the same big,
> homogeneous data file.
> The array is still read into memory as one big blob.
> But this time, the conversion is copied to another array,
> i.e. not in place.
> The result is still memcmp'ed.
> Still repeated 1000 times.
>
> Swap Based: 18 seconds
> Type Based: 14 seconds
>
> When the disk-data-file was in little endian format both
> programs took about 9 seconds.
>
> --- swap based ---
> for (int trial=0; trial != 1000; ++trial) {
> {
> ifstream input("array.dat", ios::binary);
> input.read(reinterpret_cast<char*>(&tmp_array),
> sizeof(tmp_array));
> array_type::const_iterator src = tmp_array.begin();
> array_type::const_iterator end = tmp_array.end();
> array_type::iterator dst = array2.begin();
> for ( ; src != end; ++src, ++dst)
> *dst = swap<little_to_machine>(*src);

s/little_to_machine/big_to_machine/?

> }
> assert(memcmp(&array1, &array2, sizeof(array_type)) == 0);
> }
>
> --- type based ---
> for (int trial=0; trial != 1000; ++trial) {
> {
> ifstream input("array.dat", ios::binary);
> input.read(reinterpret_cast<char*>(&tmp_array),
> sizeof(tmp_array));
> interface::copy(tmp_array.begin(), tmp_array.end(),
> array2.begin());

Does this actually swap anything? Doesn't this just copy the data to unswapped objects that *would* swap on access? That's hardly a fair comparison.

> }
> assert(memcmp(&array1, &array2, sizeof(array_type)) == 0);
> }

I don't see any code reading the resulting values. That unfairly taints the tests in favor of the object-based approach. If the underlying data is big endian, then the object-based approach implies swapping on every access to the data. Reading each value once would be the optimal use case for the object-based approach. Reading multiple times would clearly favor the function-based approach.
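
To make the access-count point concrete, here is a minimal sketch that counts byte swaps rather than timing them. It uses neither the proposed library nor the benchmark's interface::copy; every name in it (byte_swap32, counted_swap, swapped_on_access, swaps_function_based, swaps_object_based) is hypothetical, and it assumes the data is always in the opposite byte order from the host, so each conversion is an unconditional swap.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: reverse the byte order of a 32-bit value.
inline std::uint32_t byte_swap32(std::uint32_t v) {
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

// Counter used only to report how many swaps each approach performs.
static std::size_t g_swap_count = 0;

inline std::uint32_t counted_swap(std::uint32_t v) {
    ++g_swap_count;
    return byte_swap32(v);
}

// Stand-in for an object-based endian type (an assumption, not the real
// library type): it stores the foreign-order bytes and swaps on every read.
struct swapped_on_access {
    std::uint32_t raw;
    operator std::uint32_t() const { return counted_swap(raw); }
};

// Function-based approach: convert the whole array once, then read plain
// integers as many times as needed. Returns the number of swaps performed.
std::size_t swaps_function_based(std::vector<std::uint32_t> data, int reads,
                                 std::uint64_t& sum) {
    assert(reads >= 0);
    g_swap_count = 0;
    for (std::uint32_t& v : data) v = counted_swap(v);  // one swap per element
    sum = 0;
    for (int r = 0; r < reads; ++r)
        for (std::uint32_t v : data) sum += v;          // reads cost no swaps
    return g_swap_count;
}

// Object-based approach: the copy does no swapping; every read does.
std::size_t swaps_object_based(const std::vector<std::uint32_t>& data,
                               int reads, std::uint64_t& sum) {
    assert(reads >= 0);
    g_swap_count = 0;
    std::vector<swapped_on_access> objs;
    objs.reserve(data.size());
    for (std::uint32_t v : data) objs.push_back(swapped_on_access{v});
    sum = 0;
    for (int r = 0; r < reads; ++r)
        for (const swapped_on_access& o : objs)
            sum += static_cast<std::uint32_t>(o);       // one swap per read
    return g_swap_count;
}
```

For N elements each read k times, the function-based version performs N swaps regardless of k, while the object-based version performs N * k. The two tie at k == 1. With k == 0, which is what the posted benchmark measures, the object-based version performs no swaps at all.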

_____
Rob Stewart robert.stewart_at_[hidden]
Software Engineer, Core Software using std::disclaimer;
Susquehanna International Group, LLP http://www.sig.com


