
Threads-Devel :

From: Anthony Williams (anthony_at_[hidden])
Date: 2008-05-04 18:45:42


Quoting Dmitriy Vyukov <dvyukov_at_[hidden]>:

>
>> From: Anthony Williams <anthony_at_[hidden]>

>> Thanks for the suggestion. Actually, I've been
>> intending to change the implementation to use a BitTestAndSet
>> instruction where it can, so try_lock becomes:
>>
>> return !bit_test_and_set(&active_count,lock_flag_bit);
>>
>> But even then, it might be worth adding a simple non-interlocked read
>> beforehand to check for the flag, and only doing the BTS if it's not set.
>>

I've checked in the BTS-based code, and would be grateful if you could
have a look.

> Well, this complicates the situation a bit.
>
> This version (1):
> return !bit_test_and_set(&active_count,lock_flag_bit);
> has the very good property that it requests the cache-line in modified
> state immediately.
>
> And this version (2):
> if (active_count & (1 << lock_flag_bit))
>     return false;
> return !bit_test_and_set(&active_count,lock_flag_bit);
> requests the cache-line in shared state first, and only after that in
> modified state.
>
> If you are optimizing for try_lock() success, then version (1) is better.
> If you are optimizing for try_lock() failure, then version (2) is better.
> It's reasonable to optimize for success because that's the "uncontended case".
> But it's also reasonable to make a failed try_lock() as lightweight as
> possible... I'm not sure which version is better in the end...
>
>
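For reference, here is a minimal sketch of the two try_lock variants being compared. It is only an illustration: std::atomic and fetch_or stand in for the Win32 interlocked bit-test-and-set, and the value 31 for lock_flag_bit, the uint32_t counter type, and the function names try_lock_v1/try_lock_v2 are assumptions made for the example, not the checked-in code.

```cpp
#include <atomic>
#include <cstdint>

// Assumed for illustration: the lock flag lives in the top bit of the counter.
constexpr unsigned lock_flag_bit = 31;
constexpr std::uint32_t lock_flag_value = std::uint32_t{1} << lock_flag_bit;

// Stand-in for an interlocked BTS: atomically sets the bit and
// returns whether it was already set.
inline bool bit_test_and_set(std::atomic<std::uint32_t>& x, unsigned bit)
{
    const std::uint32_t mask = std::uint32_t{1} << bit;
    return (x.fetch_or(mask, std::memory_order_acquire) & mask) != 0;
}

// Version (1): always perform the interlocked operation, so the
// cache-line is requested for writing straight away.
inline bool try_lock_v1(std::atomic<std::uint32_t>& active_count)
{
    return !bit_test_and_set(active_count, lock_flag_bit);
}

// Version (2): plain read first; only attempt the interlocked
// operation if the flag looked clear.
inline bool try_lock_v2(std::atomic<std::uint32_t>& active_count)
{
    if (active_count.load(std::memory_order_relaxed) & lock_flag_value)
        return false;  // already locked: fail without writing
    return !bit_test_and_set(active_count, lock_flag_bit);
}
```

Both variants return the same result for any given state of the counter; they differ only in how many accesses they make to the cache-line, which is the trade-off under discussion.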
> I've seen a 3x scalability degradation under high load on a quad-core
> machine between the following versions:
>
> 1:
> return 0 == XCHG(&var, 1); // request cache-line in modified state
>
> 2:
> int local = var; // request cache-line in shared state
> // request cache-line in modified state
> return local == CAS(&var, local, local+1);
>
> My understanding is that the reason for the scalability degradation is
> exactly the cache coherence traffic.

Yes: two accesses imply up to two cache-line transfers. If another
CPU/core modifies the value in between, the cache-line has to bounce
to the other CPU and back.
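
The two versions you measured could be written roughly as follows. This is a sketch under assumptions: std::atomic<int> stands in for your XCHG and CAS primitives, and the function names acquire_xchg and bump_load_cas are made up for the example. The comments about cache-line states describe the behaviour discussed above on typical cache-coherent hardware, not something the code itself guarantees.

```cpp
#include <atomic>

// Version 1: a single unconditional exchange. One access, which
// needs the cache-line for writing from the start.
inline bool acquire_xchg(std::atomic<int>& var)
{
    return var.exchange(1, std::memory_order_acquire) == 0;
}

// Version 2: a plain read followed by a compare-and-swap. Two
// accesses: the read can be satisfied with a shared copy of the
// line, then the CAS needs it for writing. If another core writes
// in between, the CAS fails and the line has moved twice.
inline bool bump_load_cas(std::atomic<int>& var)
{
    int local = var.load(std::memory_order_relaxed);
    return var.compare_exchange_strong(local, local + 1,
                                       std::memory_order_acquire,
                                       std::memory_order_relaxed);
}
```

Note that compare_exchange_strong reports success as a bool, whereas the CAS in your snippet returns the old value, so the `local == CAS(...)` comparison is folded into the return value here.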

Have you got access to a quad-core or true multiprocessor machine for testing?

Anthony

-- 
Anthony Williams            | Just Software Solutions Ltd
Custom Software Development | http://www.justsoftwaresolutions.co.uk
Registered in England, Company Number 5478976.
Registered Office: 15 Carrallack Mews, St Just, Cornwall, TR19 7UL

Threads-Devel list run by bdawes at acm.org, david.abrahams at rcn.com, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk