From: Alexander Terekhov (terekhov_at_[hidden])
Date: 2005-04-06 12:15:58
Peter Dimov wrote:
>
> Alexander Terekhov wrote:
> > and
> >
> > asm long atomic_decrement_strong( register long * pw ) {
[... loop -> loop1+loop2 ...]
> but it's either suboptimal (more than one sync) or incorrect (missing sync),
> I think. It needs a state machine.
And how is this
asm long atomic_decrement_strong( register long * pw ) {
<load-reserved>
<add -1>
<branch if zero to acquire>
{lw}sync
loop1:
<store-conditional>
<branch if !failed to done>
loop2:
<load-reserved>
<add -1>
<branch if !zero to loop1>
acquire:
<store-conditional>
<branch if failed to loop2>
isync
done:
<...>
}
incorrect or suboptimal?
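For readers who do not follow the PowerPC mnemonics, the intent of the pseudocode above can be approximated in portable C++. This is only a sketch under assumptions: it reads the {lw}sync before the store as "release when the new count is nonzero" and the isync after the store as "acquire when the new count is zero". It uses std::atomic, which is not part of the original code, and it takes a std::atomic<long> reference rather than the raw long pointer of the original. It says nothing about whether the hand-written loop is optimal.

```cpp
#include <atomic>

// Hypothetical portable rendering of atomic_decrement_strong; not the
// PowerPC code discussed in this thread.
inline long atomic_decrement_strong(std::atomic<long>& w)
{
    long old_value = w.load(std::memory_order_relaxed);
    for (;;)
    {
        long new_value = old_value - 1;

        // Assumed ordering rule: acquire if the count drops to zero,
        // release otherwise.
        std::memory_order order = (new_value == 0)
            ? std::memory_order_acquire
            : std::memory_order_release;

        // On failure old_value is reloaded, so the ordering is chosen
        // again from the fresh value on the next iteration.
        if (w.compare_exchange_weak(old_value, new_value, order,
                                    std::memory_order_relaxed))
        {
            return new_value;
        }
    }
}
```

Because the ordering is recomputed on every retry, the sketch avoids the question of which barrier has already been issued when a store-conditional fails, and that question is what the hand-written variants differ on.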
>
> loop0:
>
> lwarx
> add -1
> beq acquire-without-sync
>
> sync
>
> loop1:
>
> stwcx.
> beq+ done
>
> loop2:
>
> lwarx
> add -1
> bne loop1
>
> acquire-with-sync:
>
> stwcx.
> bne- loop2
> isync
> blr
>
> acquire-without-sync:
>
> stwcx.
> bne- loop0
> isync
>
> done:
>
> blr
I must be missing something, but it looks to me that you have
way too much branching and isync-ing.
regards,
alexander.