|
Boost : |
Subject: Re: [boost] [XInt] CoW/Move Timings
From: Chad Nelson (chad.thecomfychair_at_[hidden])
Date: 2011-03-10 09:48:56
On Wed, 09 Mar 2011 23:30:21 -0800
"Jeffrey Lee Hellrung, Jr." <jhellrung_at_[hidden]> wrote:
>On 3/9/2011 9:35 PM, Chad Nelson wrote:
>> On Wed, 09 Mar 2011 17:36:40 -0800
>> Steven Watanabe<watanabesj_at_[hidden]> wrote:
>>
>>>> The "review code" lines are for the unmodified review version of
>>>> the code. All others are for the pass-by-reference version, with
>>>> several other modifications that should have no measurable effect
>>>> on the speed.
>>>
>>> Is this version of the code available somewhere?
>>
>> <http://www.oakcircle.com/download/xint-postreview1.zip>
>
> It doesn't look like any of the cpp files in libs/xint/test
> correspond to the benchmark you posted most recently. Am I blind, or
> should I be looking elsewhere?
I'm not sure what you mean. I simply defined PERFORMANCE_TEST
in /libs/xint/test/test_main.cpp and used the Linux "time" command to
get the timings for each run.
> I'm stepping through an execution of test_main.cpp with "#define
> PERFORMANCE_TEST" uncommented in the MSVC9 debugger and with an added
> "#define BOOST_XINT_USE_MOVE" (why isn't this defined by default???).
Because Boost.Move was still very much up in the air when I wrote that
code.
> First thing I notice is that, indeed, CoW is used internally in the
> raw_integer_t -> magnitude_manager_t, unless one undefs
> BOOST_XINT_USE_COW
As I explained earlier.
> (in which case, there are still reference counts floating around,
> sadly; but it looks like every attachment is immediately followed by
> a deep copy).
That was the only way I could see to provably disable it for timing
purposes, without rewriting large portions of the code.
> However, it seems some operations still assume CoW behavior of
> raw_integer_t.
As I pointed out in my original timings message: "Note that the
Boost.Move numbers could be a little better. The integer_t classes are
move-enabled, but the internal data class is not as yet. The data class
only sometimes uses return values, so the difference isn't likely to be
great."
> Did you undef BOOST_XINT_USE_COW in your benchmarking of sans-CoW?
Yes. That was the point of putting BOOST_XINT_USE_COW in there. It
wasn't there in the review version of the code.
> Even with CoW, there are some tweaks to your general approach that I
> think can speed things up a bit. Sticking to the execution I'm
> following in operator+ above, n1 and n2 each had length 64, so n1 is
> first resized to 65 (triggering an allocation), n1's data is copied
> into the new storage, and n2's data is then added to n1's data. Thus,
> we're making an extra unnecessary pass over n1's data. Probably
> better to take both parameters by reference-to-const, allocate an
> uninitialized block of max(n1.length, n2.length) + 1, and then run
> over all 3 arrays once.
Good catch.
> What I'm really finding troubling, though, is the deep copy performed
> at line 2119 in integer.hpp when using thread-safe integer_t's:
>
> integer_t<BOOST_XINT_APARAMS> r(n1._data() + n2._data());
>
> n1._data() + n2._data() executes the BOOST_XINT_RAWINT overload of
> operator+, and this object is *not* moved into r, but deep copied
> (!), since the temporary raw_integer_t object makes the reference
> count 2 once the r object attaches to it. If my analysis is wrong on
> this, please correct me, but I'm guessing this isn't what was
> intended, right?
That's not what was intended for the original code. The
deep-copy-everything behavior is, as mentioned above, a hack that I
added in the last few days to provably and completely disable
copy-on-write.
> The move constructor (invoked in the following line) seems to work as
> expected. By the way, if you're going to implement move construction
> in terms of swap, I'm told you should follow the swap with a clear of
> the moved-from object. Since a temporary exists all the way "'til
> the semicolon", which could be a long time in a real execution, this
> ensures that unused resources are never held by a temporary longer
> than necessary.
Noted.
> That one deep copy, observed in a very short period of time stepping
> through the code, doesn't give me a lot of confidence in your non-CoW
> timings, as it's probably not unlikely there are multiple instances
> of this same problem. I hope this convinces you of the importance of
> tracking the number of allocations.
>
> Further, your non-CoW timings are somewhat poisoned anyway by the
> persistent presence of reference counting, which is just dead weight
> adding to the timings. I think until we get a stripped down integer
> class that just consists of a std::vector magnitude and a bool sign,
> it will be hard to trust any comparison between CoW and non-CoW. And
> it will be more convenient to code this up if the arithmetic
> algorithms can automatically work with a std::vector of digits ;)
Then there's no way to prove whether CoW is useful or not until I
rewrite the entire code to remove it anyway.
-- Chad Nelson Oak Circle Software, Inc. * * *
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk