From: Howard Hinnant (howard.hinnant_at_[hidden])
Date: 2006-03-16 15:46:27
On Mar 13, 2006, at 6:08 PM, Howard Hinnant wrote:
> On Mar 13, 2006, at 6:00 PM, Peter Dimov wrote:
>
>>> I'm seeing (only with inlining/optimizations on) that the reference
>>> count is not properly incremented. This is with a single-threaded
>>> test case. I haven't fully understood the code differences yet, but
>>> it appears that code is being reordered across these assembly blocks
>>> in such a way as to change the logic. I'm not familiar enough with
>>> gcc assembly semantics to know if the volatile attribute (on the
>>> assembly block itself, not any data) is required to make inline
>>> assembly behave itself.
>>>
>>> Again, this is not a multithread issue that I can see, as the
>>> failures I'm seeing are in single threaded code.
>>
>> There is a known bug in 4.0:
>>
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21528
>>
>> that can have this effect, but it apparently has been fixed in 4.0.1. There
>> might be others that the boost test doesn't uncover.
>>
>> I think that the compiler is not supposed to reorder across __asm__ blocks
>> in a way that could change the logic and if such reordering happens it
>> should be reported as a bug against g++.
>
> That was very kind of you to look that up for me, thanks. Since my
> last posting I've also become convinced that this is an optimization
> bug in gcc. It appears that exactly one call to atomic_increment has
> been optimized away. I'll investigate further whether 21528 is the
> cause/fix. Thanks much.
I'm back.... ;-)
My real problem is that I'm a neophyte in the bewildering world of
gcc asm syntax. I've been studying:
http://gcc.gnu.org/onlinedocs/gcc-4.0.3/gcc/Extended-Asm.html#Extended-Asm
on the good advice of some colleagues. The other advice I'm getting
is that there is still a bug in:
boost/detail/sp_counted_base_gcc_ppc.hpp
Looking at just atomic_increment:
inline void atomic_increment( int * pw )
{
    int tmp;
    __asm__
    (
        "0:\n\t"
        "lwarx %1, 0, %2\n\t"
        "addi %1, %1, 1\n\t"
        "stwcx. %1, 0, %2\n\t"
        "bne- 0b":
        "=m"( *pw ), "=&b"( tmp ):
        "r"( pw ):
        "cc"
    );
}
My current understanding is that the "=m" constraint tells the compiler
that *pw is write-only, so the compiler may assume the asm block never
reads the old value of *pw. Since the block does read *pw (via lwarx),
the constraint should be changed to "+m" to mark the operand as
read/write memory. Indeed, when I make this change, my test case clears
right up. It also appears that atomic_decrement and
atomic_conditional_increment need the same treatment.
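
For concreteness, here is a minimal sketch of that change. It is not the
actual Boost fix, only my reading of what "+m" would look like. The PowerPC
branch is the function above with the single constraint changed. The x86
branch and the __sync_fetch_and_add fallback are assumptions added purely so
the sketch builds on other targets with GCC or Clang, and check_increment is
a hypothetical helper that mimics a single-threaded test.

```cpp
#include <cassert>

// Sketch only: assumes GCC or Clang (extended asm and __sync builtins).
inline void atomic_increment( int * pw )
{
#if defined(__powerpc__) || defined(__ppc__)
    int tmp;
    __asm__
    (
        "0:\n\t"
        "lwarx %1, 0, %2\n\t"   // load *pw and reserve the address
        "addi %1, %1, 1\n\t"    // tmp = tmp + 1
        "stwcx. %1, 0, %2\n\t"  // store only if the reservation still holds
        "bne- 0b":              // otherwise retry
        "+m"( *pw ), "=&b"( tmp ): // outputs: *pw is now read/write
        "r"( pw ):                 // inputs
        "cc"                       // clobbers
    );
#elif defined(__i386__) || defined(__x86_64__)
    // Assumption: an x86 analogue, shown only to illustrate "+m" elsewhere.
    __asm__
    (
        "lock\n\t"
        "incl %0":
        "+m"( *pw ): // outputs: read/write memory operand
        :            // no inputs
        "cc"         // clobbers
    );
#else
    // Assumption: compiler builtin as a fallback on other targets.
    __sync_fetch_and_add( pw, 1 );
#endif
}

// Hypothetical single-threaded check: a plain store to the counter followed
// by two increments must leave it at 3, even with optimizations enabled.
inline void check_increment()
{
    int count = 1;
    atomic_increment( &count );
    atomic_increment( &count );
    assert( count == 3 );
}
```

With "=m" the compiler would be entitled to treat the initial store to count
as dead, because it was told the asm only overwrites it. With "+m" the old
value is an input, so the store has to be kept.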
Looking forward to further discussion on this...
-Howard