Boost logo

Boost :

Subject: Re: [boost] [XInt] CoW/Move Timings
From: Jeffrey Lee Hellrung, Jr. (jhellrung_at_[hidden])
Date: 2011-03-10 02:30:21


On 3/9/2011 9:35 PM, Chad Nelson wrote:
> On Wed, 09 Mar 2011 17:36:40 -0800
> Steven Watanabe<watanabesj_at_[hidden]> wrote:
>
>>> The "review code" lines are for the unmodified review version of the
>>> code. All others are for the pass-by-reference version, with several
>>> other modifications that should have no measurable effect on the
>>> speed.
>>
>> Is this version of the code available somewhere?
>
> <http://www.oakcircle.com/download/xint-postreview1.zip>

It doesn't look like any of the cpp files in libs/xint/test correspond
to the benchmark you posted most recently. Am I blind, or should I be
looking elsewhere?

I'm stepping through an execution of test_main.cpp with "#define
PERFORMANCE_TEST" uncommented in the MSVC9 debugger and with an added
"#define BOOST_XINT_USE_MOVE" (why isn't this defined by default???).

First thing I notice is that, indeed, CoW is used internally in the
raw_integer_t -> magnitude_manager_t, unless one undefs
BOOST_XINT_USE_COW (in which case, there are still reference counts
floating around, sadly; but it looks like every attachment is
immediately followed by a deep copy). However, it seems some operations
still assume CoW behavior of raw_integer_t. For example, the operator+
for BOOST_XINT_RAWINT is defined as

BOOST_XINT_RAWINT_TPL
BOOST_XINT_RAWINT operator+(BOOST_XINT_RAWINT n1, const
BOOST_XINT_RAWINT &n2) {
     n1 += n2;
     return n1;
}

which, when CoW is disabled (by this, I mean BOOST_XINT_USE_COW is *not*
defined), could trigger an additional unnecessary allocation, due to the
sum n1 + n2 having a longer length than n1's capacity.

Did you undef BOOST_XINT_USE_COW in your benchmarking of sans-CoW?

Even with CoW, there are some tweaks to your general approach that I
think can speed things up a bit. Sticking to the execution I'm
following in operator+ above, n1 and n2 each had length 64, so n1 is
first resized to 65 (triggering an allocation), n1's data is copied into
the new storage, and n2's data is then added to n1's data. Thus, we're
making an extra unnecessary pass over n1's data. Probably better to
take both parameters by reference-to-const, allocate an uninitialized
block of max(n1.length, n2.length) + 1, and then run over all 3 arrays once.

What I'm really finding troubling, though, is the deep copy performed at
line 2119 in integer.hpp when using thread-safe integer_t's:

         integer_t<BOOST_XINT_APARAMS> r(n1._data() + n2._data());

n1._data() + n2._data() executes the BOOST_XINT_RAWINT overload of
operator+, and this object is *not* moved into r, but deep copied (!),
since the temporary raw_integer_t object makes the reference count 2
once the r object attaches to it. If my analysis is wrong on this,
please correct me, but I'm guessing this isn't what was intended, right?

The move constructor (invoked in the following line) seems to work as
expected. By the way, if you're going to implement move construction in
terms of swap, I'm told you should follow the swap with a clear of the
moved-from object. Since a temporary exists all the way "'til the
semicolon", which could be a long time in a real execution, this ensures
that unused resources are never held by a temporary longer than necessary.

That one deep copy, observed in a very short period of time stepping
through the code, doesn't give me a lot of confidence in your non-CoW
timings, as it's probably not unlikely there are multiple instances of
this same problem. I hope this convinces you of the importance of
tracking the number of allocations.

Further, your non-CoW timings are somewhat poisoned anyway by the
persistent presence of reference counting, which is just dead weight
adding to the timings. I think until we get a stripped down integer
class that just consists of a std::vector magnitude and a bool sign, it
will be hard to trust any comparison between CoW and non-CoW. And it
will be more convenient to code this up if the arithmetic algorithms can
automatically work with a std::vector of digits ;)

- Jeff


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk