|
Boost : |
From: Anthony Williams (anthony_w.geo_at_[hidden])
Date: 2006-10-27 11:15:01
"Peter Dimov" <pdimov_at_[hidden]> writes:
> Peter Dimov wrote:
>> Anthony Williams wrote:
>>> "Peter Dimov" <pdimov_at_[hidden]> writes:
>>>
>>>> Peter Dimov wrote:
>>>>> Howard Hinnant wrote:
>>>>
>>>>>> So the fast path would check these bits and avoid the notify_all
>>>>>> if they weren't set.
>>>>>
>>>>> Thanks, I'll try this tomorrow. Later today, I mean.
>>>>
>>>> Works splendidly on uniprocessor, but displays the same odd behavior
>>>> I've been seeing with the other algorithms on my Pentium D (with
>>>> XP64):
>>>>
>>>> 0R+4W+0F: 46
>>>> 0R+0W+4F: 14
>>>> 0R+4W+4F: 10 (!)
>>>>
>>>> The Windows scheduler doesn't like us. Or I've made a mistake
>>>> somewhere. Even if I didn't, it seems to me that the bit doesn't
>>>> help in the 4W case since there's always a writer waiting, so we hit
>>>> the notify_all.
>>>
>>> Which mutex/condition implementation are you using?
>>> RC-1.34/HEAD/1.33.1 or thread_rewrite?
>>
>> HEAD, but the phenomenon above is not synchronization
>> primitive-specific.
>> I'm seeing it with semaphore-based implementations as well.
>
> Your last algorithm also suffers from it, not as much, though:
>
> 0R+4W+0F: 26
> 0R+0W+4F: 13
> 0R+4W+4F: 11
>
> My "naive" algorithm (slightly enhanced to use an event instead of mutex+cv
> for waiting) gave
>
> 0R+4W+0F: 14
> 0R+0W+4F: 14
> 0R+4W+4F: 11
>
> which I explain by it being "honest" to the kernel and using a mutex for the
> writers instead of building its own. :-)
>
> This is probably very much OS/kernel/CPU/workload specific, though.
I guess it's caused by active contention. If there's other threads doing
stuff, then the threads competing on the lock don't hit it so hard, and don't
have to retry the CAS so many times. I expect they also hit the fast path more
often, too.
> Your
> algorithm definitely holds its own in all realistic scenarios.
Thanks. I'll commit it to thread_rewrite, as it's definitely better than
what's there at the moment (since that version deadlocks).
If you have any more thoughts on an improvement/new algorithm, I'd be glad to
see it. Your input is much appreciated.
Anthony
-- Anthony Williams Software Developer Just Software Solutions Ltd http://www.justsoftwaresolutions.co.uk
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk