Subject: [boost] Problems with yield_k workaround. Still too slow.
From: Marcello Pietrobon (marcello.pietrobon_at_[hidden])
Date: 2013-08-21 01:31:58
Hello Boost team,
As described by Gav Wood in
http://boost.2283326.n4.nabble.com/Boost-Interprocess-Chronic-performance-on-Win7-td3755619.html
Windows does not handle context switching between threads well.
Basically, if a thread asks to sleep for 1ms, the OS puts that thread to
sleep for its entire timeslice (~20ms), even if the concurrent threads need
much less time than that to complete their job.
The effect is that if two threads need to talk to each other through an
emulated synchronization object, they take a good 2 seconds to complete
just 100 round trips.
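The overshoot of a nominal 1ms sleep can be checked directly. The following is only a minimal sketch: it uses the portable std::this_thread::sleep_for as a stand-in for the Win32 Sleep(1) call, and the helper name average_sleep_ms is made up for this illustration. The value it returns depends on the OS and its timer resolution.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical helper: average wall-clock cost, in milliseconds, of
// `rounds` requests to sleep for 1 ms.
// Assumption: sleep_for(1ms) stands in for the Win32 Sleep(1) call.
inline double average_sleep_ms(int rounds)
{
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    for (int i = 0; i < rounds; ++i)
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    const std::chrono::duration<double, std::milli> elapsed =
        clock::now() - start;
    return elapsed.count() / rounds;
}
```

Calling average_sleep_ms(100) and printing the result shows how far the real cost of one sleep is from the requested 1ms on a given machine.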
At least it seems to me that this problem arises only when *emulating* a
synchronization object, which I understand is necessary, for example, in the
interprocess library.
This is not an easy problem to solve.
There is an ongoing discussion about how to fix this in the interprocess
library (see for example
http://boost.2283326.n4.nabble.com/Boost-Interprocess-conditions-variables-get-10-times-faster-when-opening-a-multiprocess-browser-td4650763.html
) but clearly the problem is not limited to that library.
The workaround
http://www.boost.org/doc/libs/1_54_0/boost/smart_ptr/detail/yield_k.hpp
improves the situation, but only slightly.
In order to isolate the problem I've run some tests using the interprocess
library, with the example comp_doc_anonymous_conditionA.cpp (and B).
This test is slowed down by a factor ranging from 2 to 10, 100, or even
1000 or more, depending on how soon that infamous Sleep(1) is triggered.
*The bottom line is that in yield_k.hpp we need to change the constant in
the code*
from 32 to *1000* at least.
I call this constant kMax:
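As a rough illustration of where such a constant sits, here is a minimal sketch of a yield_k-style backoff ladder with the sleep threshold exposed as a parameter. This is not the Boost source: the names BackoffStep, backoff_step and yield_sketch are hypothetical, the spin threshold of 4 is an assumption, and std::this_thread::yield / sleep_for stand in for Sleep(0) / Sleep(1).

```cpp
#include <cassert>
#include <chrono>
#include <thread>

enum class BackoffStep { Spin, Yield, Sleep };

// Decides what the k-th failed attempt should do. kMax is the threshold
// at which the 1 ms sleep starts (32 in the case discussed in this post).
// The spin threshold of 4 is an assumption made for this sketch.
inline BackoffStep backoff_step(unsigned k, unsigned kMax)
{
    if (k < 4)    return BackoffStep::Spin;
    if (k < kMax) return BackoffStep::Yield;
    return BackoffStep::Sleep;
}

// Performs the chosen step, using portable stand-ins for the Win32 calls.
inline void yield_sketch(unsigned k, unsigned kMax)
{
    switch (backoff_step(k, kMax)) {
    case BackoffStep::Spin:
        break;                                   // just retry
    case BackoffStep::Yield:
        std::this_thread::yield();               // stand-in for Sleep(0)
        break;
    case BackoffStep::Sleep:                     // stand-in for Sleep(1)
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
        break;
    }
}
```

With kMax = 32 the 33rd failed attempt already pays for a sleep; with kMax = 1000 the same attempt only yields.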
Here are some test results, run on XP SP3 on a Xeon E5450 (dual socket,
32bit, quad cores, 2GHz core speed).
For the test I changed comp_doc_anonymous_conditionA.cpp :
that is 100000 iterations,
while in comp_doc_anonymous_conditionB.cpp I've added the line
just to emulate a job done by the thread while owning the mutex.
That is:
The results are (these numbers are representative of many tests I've run):

kMax = 32, jMax = 0 : time = 00:00:01.125000
kMax = 50, jMax = 0 : time = 00:00:00.403750
kMax = 100, jMax = 0 : time = 00:00:00.2803750
kMax = 65535, jMax = 0 : time = 00:00:00.203125
kMax ~= 64 is good

kMax = 32, jMax = 10 : time = 00:00:32.843750
kMax = 100, jMax = 10 : time = 00:00:02.750000
kMax = 1000, jMax = 10 : time = 00:00:00.859375
kMax = 65535, jMax = 10 : time = 00:00:00.328125
kMax ~= 1000 is good

kMax = 32, jMax = 100 : time = 00:04:01.953125
kMax = 50, jMax = 100 : time = 00:00:04.296875
kMax = 1000, jMax = 100 : time = 00:00:02.062500
kMax = 65535, jMax = 100 : time = 00:00:01.593750
kMax ~= 1000 is good

If the job takes longer we need to increase kMax even more,
but a value of 65535 seems to eliminate the problem while still making sure
that a blocked thread doesn't take most of the CPU time for more than one
timeslice.
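
To experiment with kMax and jMax without the interprocess examples, a rough in-process analogue can be sketched as below. It is not the test used for the numbers above: two threads hand a turn back and forth through an atomic flag instead of an interprocess condition variable, the function names are hypothetical, the spin threshold of 4 is an assumption, and std::this_thread::yield / sleep_for stand in for Sleep(0) / Sleep(1). Timings from it will not match the figures above.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical stand-in for the job done while owning the mutex:
// jMax iterations of trivial work the optimizer cannot remove.
inline void emulate_job(unsigned jMax)
{
    volatile unsigned sink = 0;
    for (unsigned j = 0; j < jMax; ++j)
        sink = sink + j;
}

// Waits until `turn` equals `me`, backing off in three stages with the
// sleep threshold passed in as kMax.
inline void wait_for_turn(const std::atomic<int>& turn, int me, unsigned kMax)
{
    for (unsigned k = 0; turn.load(std::memory_order_acquire) != me; ++k) {
        if (k < 4)
            continue;                            // busy spin (assumed threshold)
        if (k < kMax)
            std::this_thread::yield();           // stand-in for Sleep(0)
        else                                     // stand-in for Sleep(1)
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
}

// Runs `iterations` round trips between two threads and returns the
// elapsed wall-clock time in seconds.
inline double ping_pong_seconds(unsigned iterations, unsigned kMax,
                                unsigned jMax)
{
    std::atomic<int> turn{0};
    const auto start = std::chrono::steady_clock::now();
    std::thread peer([&] {
        for (unsigned i = 0; i < iterations; ++i) {
            wait_for_turn(turn, 1, kMax);
            emulate_job(jMax);
            turn.store(0, std::memory_order_release);
        }
    });
    for (unsigned i = 0; i < iterations; ++i) {
        wait_for_turn(turn, 0, kMax);
        emulate_job(jMax);
        turn.store(1, std::memory_order_release);
    }
    peer.join();
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}
```

Comparing, say, ping_pong_seconds(100000, 32, 10) with ping_pong_seconds(100000, 1000, 10) on the target machine gives a feel for how sensitive the round-trip time is to the threshold.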