Re: [boost] [XInt] CoW/Move Timings

10 Mar 2011


      On Wed, 09 Mar 2011 23:30:21 -0800
"Jeffrey Lee Hellrung, Jr." <jhellrung@ucla.edu> wrote:
...
On 3/9/2011 9:35 PM, Chad Nelson wrote:
...
On Wed, 09 Mar 2011 17:36:40 -0800
Steven Watanabe<watanabesj@gmail.com>  wrote:
...
...
The "review code" lines are for the unmodified review version of
the code. All others are for the pass-by-reference version, with
several other modifications that should have no measurable effect
on the speed.
Is this version of the code available somewhere?
<http://www.oakcircle.com/download/xint-postreview1.zip>
It doesn't look like any of the cpp files in libs/xint/test
correspond to the benchmark you posted most recently.  Am I blind, or
should I be looking elsewhere?
I'm not sure what you mean. I simply defined PERFORMANCE_TEST
in /libs/xint/test/test_main.cpp and used the Linux "time" command to
get the timings for each run.
...
I'm stepping through an execution of test_main.cpp with "#define 
PERFORMANCE_TEST" uncommented in the MSVC9 debugger and with an added 
"#define BOOST_XINT_USE_MOVE" (why isn't this defined by default???).
Because Boost.Move was still very much up in the air when I wrote that
code.
...
First thing I notice is that, indeed, CoW is used internally in the 
raw_integer_t -> magnitude_manager_t, unless one undefs 
BOOST_XINT_USE_COW
As I explained earlier.
...
(in which case, there are still reference counts floating around,
sadly; but it looks like every attachment is immediately followed by
a deep copy).
That was the only way I could see to provably disable it for timing
purposes, without rewriting large portions of the code.
...
However, it seems some operations still assume CoW behavior of
raw_integer_t.
As I pointed out in my original timings message: "Note that the
Boost.Move numbers could be a little better. The integer_t classes are
move-enabled, but the internal data class is not as yet. The data class
only sometimes uses return values, so the difference isn't likely to be
great."
...
Did you undef BOOST_XINT_USE_COW in your benchmarking of sans-CoW?
Yes. That was the point of putting BOOST_XINT_USE_COW in there. It
wasn't there in the review version of the code.
...
Even with CoW, there are some tweaks to your general approach that I 
think can speed things up a bit.  Sticking to the execution I'm 
following in operator+ above, n1 and n2 each had length 64, so n1 is 
first resized to 65 (triggering an allocation), n1's data is copied
into the new storage, and n2's data is then added to n1's data.  Thus,
we're making an extra unnecessary pass over n1's data.  Probably
better to take both parameters by reference-to-const, allocate an
uninitialized block of max(n1.length, n2.length) + 1, and then run
over all 3 arrays once.
Good catch.
...
What I'm really finding troubling, though, is the deep copy performed
at line 2119 in integer.hpp when using thread-safe integer_t's:
integer_t<BOOST_XINT_APARAMS> r(n1._data() + n2._data());
n1._data() + n2._data() executes the BOOST_XINT_RAWINT overload of 
operator+, and this object is *not* moved into r, but deep copied
(!), since the temporary raw_integer_t object makes the reference
count 2 once the r object attaches to it.  If my analysis is wrong on
this, please correct me, but I'm guessing this isn't what was
intended, right?
That's not what was intended for the original code. The
deep-copy-everything behavior is, as mentioned above, a hack that I
added in the last few days to provably and completely disable
copy-on-write.
...
The move constructor (invoked in the following line) seems to work as 
expected.  By the way, if you're going to implement move construction
in terms of swap, I'm told you should follow the swap with a clear of
the moved-from object.  Since a temporary exists all the way "'til
the semicolon", which could be a long time in a real execution, this
ensures that unused resources are never held by a temporary longer
than necessary.
Noted.
...
That one deep copy, observed in a very short period of time stepping 
through the code, doesn't give me a lot of confidence in your non-CoW 
timings, as it's probably not unlikely there are multiple instances
of this same problem.  I hope this convinces you of the importance of 
tracking the number of allocations.
Further, your non-CoW timings are somewhat poisoned anyway by the 
persistent presence of reference counting, which is just dead weight 
adding to the timings.  I think until we get a stripped down integer 
class that just consists of a std::vector magnitude and a bool sign,
it will be hard to trust any comparison between CoW and non-CoW.  And
it will be more convenient to code this up if the arithmetic
algorithms can automatically work with a std::vector of digits ;)
Then there's no way to prove whether CoW is useful or not until I
rewrite the entire code to remove it anyway.
-- 
Chad Nelson
Oak Circle Software, Inc.
*
*
*