Subject: Re: [boost] [proposal] raw move (was: [interest] underlying type library)
From: Julian Gonggrijp (j.gonggrijp_at_[hidden])
Date: 2011-08-24 07:00:59
Gottlob Frege wrote:
> On Tue, Aug 23, 2011 at 11:12 PM, Gottlob Frege <gottlobfrege_at_[hidden]> wrote:
>>
>> raw_move assumes that later the algorithm will raw_move back into the
>> temporarily invalid source object. So sooner or later we write to
>> that memory. Depending on caching scenarios, it might actually be
>> faster to write to that memory *sooner*.
We can write sooner, but that doesn't change our need to write later,
does it? So either we write zeros now and the new value later, or we
write only the new value later.
Or are you saying that the compiler might be able to predict what
value we're going to move later and move that value sooner?
Or was your point maybe that reading the source object of a raw move
might not be enough to promote it in the cache, and that overwriting
it with zeros instead will therefore speed up access once the object
becomes the target of a new raw move?
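To make the two alternatives concrete, here is a minimal sketch. It is
not the proposed interface: raw_move_sketch, zeroing_move_sketch,
swap_with_raw_moves and Pod are hypothetical names, and I assume
trivially copyable types only. The first function leaves the source
bytes alone, the second writes zeros to the source immediately.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical example type; any trivially copyable type would do.
struct Pod { int id; double weight; };

// Assumed stand-in for raw_move: copy the bytes and leave the source
// untouched. The source is logically invalid until it is overwritten.
template <class T>
void raw_move_sketch(T& src, T& dst) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "sketch only covers trivially copyable types");
    assert(&src != &dst);  // memcpy requires non-overlapping objects
    std::memcpy(&dst, &src, sizeof(T));
}

// Alternative: same copy, but the source is written "sooner" with zeros.
template <class T>
void zeroing_move_sketch(T& src, T& dst) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "sketch only covers trivially copyable types");
    assert(&src != &dst);
    std::memcpy(&dst, &src, sizeof(T));
    std::memset(&src, 0, sizeof(T));
}

// Swap built from three raw moves. Every slot that becomes invalid is
// the target of a later raw move, so it is written exactly once.
template <class T>
void swap_with_raw_moves(T& a, T& b) {
    T tmp;                    // assumes T is default constructible
    raw_move_sketch(a, tmp);  // a is now logically invalid
    raw_move_sketch(b, a);    // a is valid again, b is invalid
    raw_move_sketch(tmp, b);  // b is valid again
}
```

With zeroing_move_sketch in place of raw_move_sketch, the same swap
would write a and b twice each: once with zeros, once with the new
value.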
>> [...]
>> But I wouldn't be that surprised if raw_move offered no speed up in most cases.
To be honest, by now I wouldn't be surprised by that either.
Christopher Jefferson already argued quite convincingly that there
might be very little to gain. I'm still going to write a benchmark,
but I'm prepared for the possibility that it will be no more than a
benchmarking exercise.
> Particularly if you are still calling raw_move one container element at a time.
> If you really want to speed things up, you need to memcpy a whole
> block of objects at once, i.e. an array/vector of PODs.
> If 100 elements is 100 memcpy calls, I'm not sure you will get much benefit.
> If we can get 100 elements to compile into 1 memcpy call, there is a
> chance at a speed up. A good memcpy is hand optimized for the given
> architecture to prime the cache as it moves along, etc. That only
> works if it is one big call.
Yes, I agree that this is a much more powerful way to speed up a
program.
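For illustration, a minimal sketch of the difference between one
memcpy call per element and one call for the whole block. The names
Pod, move_elementwise and move_block are hypothetical, and I assume
trivially copyable elements stored contiguously in two distinct
vectors. It makes no claim about timings; an optimizing compiler may
well turn the loop into a single call on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical element type for the example.
struct Pod { int id; double weight; };
static_assert(std::is_trivially_copyable<Pod>::value,
              "memcpy is only valid for trivially copyable types");

// 100 elements means 100 memcpy calls.
void move_elementwise(const std::vector<Pod>& src, std::vector<Pod>& dst) {
    assert(&src != &dst && dst.size() >= src.size());
    for (std::size_t i = 0; i < src.size(); ++i)
        std::memcpy(&dst[i], &src[i], sizeof(Pod));
}

// 100 elements means 1 memcpy call over the contiguous storage.
void move_block(const std::vector<Pod>& src, std::vector<Pod>& dst) {
    assert(&src != &dst && dst.size() >= src.size());
    if (!src.empty())
        std::memcpy(dst.data(), src.data(), src.size() * sizeof(Pod));
}
```

Both functions leave dst with the same contents; only the number of
memcpy calls differs.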
-Julian