Boost logo

Boost Users :

From: Gottlob Frege (gottlobfrege_at_[hidden])
Date: 2006-05-06 03:27:41


> Date: Fri, 5 May 2006 14:09:33 -0600
> From: Ray Whitmer <ray_at_[hidden]>
> Subject: Re: [Boost-users] Thread-safe smart pointers, hashtables
> without locking
> To: boost-users_at_[hidden]
> Message-ID: <0D5E3BD4-9A5D-4CE2-9040-35500773C190_at_[hidden]>
> Content-Type: text/plain; charset=US-ASCII; delsp=yes; format=flowed
>
> > The algorithm makes sense, and if all these claims hold then it
> > should work, I think. I'm still concerned about step (2), however,
> > if only because of the assumption that what you write as a load/
> > store in C++ maps to a load/store in ML. Will it? Always? I ask as
> > a non-rhetorical question: does the C++ standard *guarantee* it?
> > Because if it doesn't then it seems like a potential problem, but
> > maybe I'm being too careful.

I'd also say the idea of moving a pointer out of the way, and then using the
flip-gate-count-thing to wait for any 'potentially-using' readers to finish,
definitely has 'potential'. You would need to show detailed code to really
be sure. And not to us - to the guys on comp.programming.threads.

If I could answer in the affirmative, I could prove it works and not
> have to talk to anyone. I can make a bunch of lesser claims, that of
> course fail to add up to the strong claim but make me feel better:
>
> 3. I have looked at the output of many C compilers and have never
> seen a pointer assignment take multiple instructions. The exception
> might be if someone used a poorly-written memcpy-like operations to
> transfer the contents of a structure byte by byte. A well-written
> one takes the respective alignments into account and transfers
> aligned 32-bit quantities one-by-one. We are talking about code in a
> controlled Hashtable class, it would have to be the "C" compiler
> choosing to transfer a class consisting of a single address field
> using something other than a single instruction. It doesn't sound
> logical to me, but it wouldn't surprise me completely to see it
> happen, but I would report it as a compiler bug and expect it to be
> addressed even though it is not in the standard.

As long as the address of the pointer is properly aligned, it will be
atomic. The language doesn't guarantee that (the language doesn't talk
about atomic or threads at all - at least not yet), but the processors +
compilers tend to guarantee it. HOWEVER! *even if it is atomic*, on a
multi-cpu machine each cpu might see memory differently - you need memory
barriers to ensure the other cpus see your new pointer value correctly.
Think of cpus and memory like harddrive controllers that reorder reads and
writes to lessen seeks - the cpu does the same sort of thing with memory -
and in doing so, it may reorder the write of your pointer to 'happen' before
you expected (because it was near other memory it was writing) and/or read
the pointer later than expected (because it optimized other reads into a
'bundle' before the read of your pointer).

Anyhow, if you haven't already, take a look at memory barriers, and, as a
canonical example, the broken 'double checked locking pattern'. You can't
do lock free without understanding this stuff.

4. These issues have to be addressed with the multi-processor cores
> becoming more prevalent because they are basic to lock-free
> processing, or can C++ accept significant inferiority to Java with
> respect to lock-free multiprocessing?

see above. Much of it doesn't come for free.

For what it is worth, if anyone else thinks the solution is worth a
> generic contribution, or is of the opposite opinion that it would not
> be generic enough, I would be interested in hearing. I suppose this
> sort of question should go to the developers list. I think it would
> be much more difficult to implement a table of shared_ptr than it is
> for intrusive_ptr, because assigning a shared_ptr is much less likely
> to be uninterrupted, leaving other threads with a broken visible
> state. There is clearly more detail to discuss as well.
> Ray Whitmer

Your wording of 'much less likely' is scary, and I think similar wording
made James apprehensive as well. There is no grey - it is black or white,
thread safe or not. And you are correct, you can't make { reading the
location of a shared_ptr + incrementing its ref-count } atomic, it least not
on most processors.

I'm not sure of the exact implementation of intrusive_ptr, but assuming it
is like most, you can make

    ip1 = ip2;

atomic IF ip2 is 'stable', ie guaranteed to not be changing during the
assignment. ie ip1 can be read and/or written by other threads "while" the
assignment is happening, and you will either get the before state of ip1, or
the after state.

I don't think the boost implementation of intrusive_ptr's operator=() is
atomic, but it could be made to be.

Tony



Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net