Subject: Re: [Boost-users] [Thread] thousands of spurious wakeups from timed_wait() per second
From: Vicente J. Botet Escriba (vicente.botet_at_[hidden])
Date: 2014-09-11 16:51:12


On 11/09/14 18:17, Steve Clark wrote:
> Thank you for your interest!
>
> Vicente J. Botet Escriba <vicente.botet <at> wanadoo.fr> writes:
>
>> On 10/09/14 04:38, Steven Clark wrote:
>> Which platforms are you testing on?
> In the original post, I erred in some details regarding platforms. I am
> currently testing on two virtual machines, both Virtual Box running Ubuntu
> Linux (64-bit), and hosted on two different computers running Windows.
>
> * One is running Linux 3.13.0, Ubuntu 14.04.1, and Boost 1.54.
> * The other runs Linux 3.2.0, Ubuntu 12.04.5, and Boost 1.46.
> * My intended target is an Intel arm i.MX27 running Linux x.y.z and Boost
> 1.48.
> I have not run my tests on the intended target yet. I can move to a
> different version of Boost if necessary but it's a nuisance. Is this the
> platform information you wanted?
Thanks for the details. Could you check directly on Windows?
>> Could you test with the develop branch?
> That would be a nuisance. Have there been relevant changes since Boost
> 1.54? It is much more recent than what I mentioned in the original post. I
> will try the development branch if you're sure it is worthwhile.
No, it is not worth it if you have already tested with 1.54. But version
1.48 is very old.
>>> I'm also having trouble with the return value of timed_wait().
>>> The documentation carefully states that timed_wait() returns "false if
>>> the call is returning because the time specified by abs_time was
>>> reached, true otherwise". This seems to mean that spurious wakeups
>>> return true -- and if you think about it, the return value is useless
>>> if spurious wakeups could return false. You'd have to add your own
>>> check for timeout.
>> The function returns true if there was a notify on this condition. This
>> doesn't mean that another thread has not already changed the result of
>> the predicate.
>> The function returns false when there was no notify before the timeout
>> was reached.
> Let me restate what I think you said, which is also what I've observed so
> far. Timed_wait() can return for three reasons: notification, timeout, or
> spurious. It returns true on notification, and false on timeout or
> spurious. (This is contrary to the documentation, which appears to be
> carefully written.)
No, it can return either because it was notified or because of the
timeout. A spurious notification is one where the predicate you expect
does not hold after the wait. This can happen because another thread has
changed the condition in the meantime.
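
A minimal sketch of the usual way to cope with this: re-check the predicate
in a loop, and use the timeout result only as the signal to stop waiting.
It uses the std:: condition variable so that it compiles without Boost;
the assumption is that the Boost.Thread chrono interface behaves the same
way. Mailbox, wait_ready and post are hypothetical names, not taken from
the test program.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical shared state: a flag protected by a mutex.
struct Mailbox {
    std::mutex m;
    std::condition_variable cv;
    bool ready = false;
};

// Returns true if `ready` was set before the deadline, false on timeout.
bool wait_ready(Mailbox& box, std::chrono::milliseconds timeout) {
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    std::unique_lock<std::mutex> lock(box.m);
    while (!box.ready) {
        // A no_timeout return only means "woken up"; the loop condition
        // decides whether the predicate really holds.
        if (box.cv.wait_until(lock, deadline) == std::cv_status::timeout) {
            return box.ready;  // one last look at the predicate
        }
    }
    return true;
}

// Sets the flag under the lock, then notifies one waiter.
void post(Mailbox& box) {
    {
        std::lock_guard<std::mutex> guard(box.m);
        box.ready = true;
    }
    box.cv.notify_one();
}
```

With this shape the caller never has to tell a spurious wakeup apart from a
real notification: only the predicate and the deadline matter.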
> I find it hard to believe that normal wait() cannot tell the difference
> between spurious and notification, yet timed_wait() can. Granted, I don't
> understand the issues that lead to spurious wakeups.
Both wait and timed_wait can have associated spurious notifications.
>
> Perhaps you meant that timed_wait() returns true for notification or
> spurious, and false for timeout or spurious.

No. When timed_wait returns false, the condition has not been notified,
so there is no possible spurious notification in this case ;-)
> That makes no sense to me
> either - there's no point in timed_wait() having a return value. What
> decision can the caller reliably make on the basis of such a return value?
If there is a timeout, this gives you an indication that maybe no thread
will ever notify you, which protects you from blocking forever.
> Thank you for running my test on a Mac. Your output is what I expected to
> see.
>
> I converted the test program to make pthread calls instead of Boost. It
> works perfectly, just like your Mac run. This is unsatisfactory because
> eventually I want the same code to run on Windows.
Agreed. Could you check directly on Windows?

Another possibility is to use the chrono interface, to see whether the
bug lies in the date-time interface.

#include <boost/chrono/chrono_io.hpp>
...

         boost::chrono::steady_clock::time_point deadline = boost::chrono::steady_clock::now();
         std::cerr << "now is " << deadline << std::endl;
         deadline += boost::chrono::seconds(2);
...

             boost::cv_status ret = gCond.wait_until(lock, deadline);
             nWakeups++;
             if (boost::cv_status::no_timeout==ret)
...
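
For reference, a self-contained sketch of the same kind of check, written
against the std:: equivalents so that it compiles without Boost (the
assumption being that boost::condition_variable::wait_until and
boost::cv_status mirror them). WakeupStats and wait_and_count are
hypothetical names. It counts wakeups reported as no_timeout and, separately,
any timeout reported while the steady clock is still before the deadline,
which is the symptom under discussion.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical diagnostic record for one timed wait.
struct WakeupStats {
    int no_timeout_wakeups = 0;  // notified or spurious wakeups
    int early_timeouts = 0;      // "timeout" reported before the deadline
    std::chrono::steady_clock::duration elapsed{};
};

// Waits on `cv` until `timeout` has really elapsed, classifying each wakeup.
WakeupStats wait_and_count(std::condition_variable& cv, std::mutex& m,
                           std::chrono::milliseconds timeout) {
    using clock = std::chrono::steady_clock;
    WakeupStats stats;
    const clock::time_point start = clock::now();
    const clock::time_point deadline = start + timeout;
    std::unique_lock<std::mutex> lock(m);
    for (;;) {
        const std::cv_status ret = cv.wait_until(lock, deadline);
        if (ret == std::cv_status::no_timeout) {
            ++stats.no_timeout_wakeups;
        } else if (clock::now() < deadline) {
            ++stats.early_timeouts;  // erroneous timeout: keep waiting
        } else {
            break;                   // genuine timeout
        }
    }
    stats.elapsed = clock::now() - start;
    return stats;
}
```

On a correct implementation with no notifier, early_timeouts should stay at
zero and elapsed should be at least the requested timeout.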
> I've been suspicious that timed_wait() isn't actually waiting at all - just
> doing something like yield().
Yes, the timestamps in your traces indicate that the timeout has not been
reached at all, and that the notifications are done every second, as
expected. The problem is the erroneous timeouts.

now is 2014-Sep-09 20:48:12.183781
now is 2014-Sep-09 20:48:12.215471
now is 2014-Sep-09 20:48:13.216511
now is 2014-Sep-09 20:48:14.216805
now is 2014-Sep-09 20:48:15.217191
now is 2014-Sep-09 20:48:16.217453
now is 2014-Sep-09 20:48:17.217685
now is 2014-Sep-09 20:48:18.217291

> When I added a check for the current time vs.
> the timeout deadline into my test program, the number of spurious wakeups
> dropped almost to half, suggesting to me that the
> boost::posix_time::microsec_clock::local_time() call almost doubles the time
> it takes to run through the inner loop.
>
> Are there any compiler flags that direct how Boost implements its thread
> functions? I think I ran across something once, something like
> BOOST_POSIX_THREADS, but I haven't found it and I have no idea what the
> alternative to pthreads might be.
>
Sorry, I don't understand what you mean. Could you rephrase?

Best,
Vicente

