Subject: [Boost-users] [thread] boost::call_once() optimization idea on Win32
From: Thomas Jarosch (thomas.jarosch_at_[hidden])
Date: 2010-11-17 11:15:38


Hello,

Here is a small optimization idea for boost::call_once() on Win32.
The code currently looks like this:
----------------------------------------------
template<typename Function>
void call_once(once_flag& flag,Function f)
{
    // Try for a quick win: if the procedure has already been called
    // just skip through:
    long const function_complete_flag_value=0xc15730e2;

    if(::boost::detail::interlocked_read_acquire(&flag)!=function_complete_flag_value)
    {
        void* const mutex_handle(::boost::detail::create_once_mutex(&flag));
        BOOST_ASSERT(mutex_handle);
        detail::win32::handle_manager const closer(mutex_handle);
        detail::win32_mutex_scoped_lock const lock(mutex_handle);

        if(flag!=function_complete_flag_value)
        {
            f();
            BOOST_INTERLOCKED_EXCHANGE(&flag,function_complete_flag_value);
        }
    }
}
----------------------------------------------

The initial flag value of zero (=BOOST_ONCE_INIT)
is set at load time via static initialization.

It should be safe to do a plain (non-interlocked) read of the flag first,
like this:
----------------------------------------------
if(flag!=function_complete_flag_value &&
  ::boost::detail::interlocked_read_acquire(&flag)!=function_complete_flag_value)
----------------------------------------------

If the non-interlocked read sees anything other than
"function_complete_flag_value", we still do the interlocked read as well.
In the common case, e.g. a thread-safe singleton whose initializer has
already run, this saves us from issuing a memory barrier on every call.
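
To make the idea concrete without the Win32 details, here is a rough,
portable sketch using std::atomic. It is not Boost's implementation: the
names once_flag_sketch, call_once_sketch and count_invocations are made up
for illustration, a std::mutex stands in for the once-mutex, and the
trailing acquire fence is my assumption about what the C++ memory model
would require once the fast path skips the acquire read. On x86 that fence
typically only constrains the compiler and emits no instruction.

```cpp
#include <atomic>
#include <mutex>

// All names here are hypothetical; this is not Boost's implementation.
constexpr unsigned long complete_value = 0xc15730e2UL;

struct once_flag_sketch {
    std::atomic<unsigned long> state{0};  // 0 plays the role of BOOST_ONCE_INIT
    std::mutex guard;                     // stands in for the Win32 once-mutex
};

template <typename Function>
void call_once_sketch(once_flag_sketch& flag, Function f)
{
    // Cheap relaxed read first; the acquire read runs only if the relaxed
    // read did not already see the "complete" value.
    if (flag.state.load(std::memory_order_relaxed) != complete_value &&
        flag.state.load(std::memory_order_acquire) != complete_value)
    {
        std::lock_guard<std::mutex> lock(flag.guard);
        // Re-check under the lock. An atomic load is a real read of the
        // object, so it cannot be satisfied from a stale register copy.
        if (flag.state.load(std::memory_order_relaxed) != complete_value)
        {
            f();
            flag.state.store(complete_value, std::memory_order_release);
        }
    }
    // Assumption: orders the relaxed fast-path read before any later reads
    // of whatever f() initialized.
    std::atomic_thread_fence(std::memory_order_acquire);
}

// Small single-threaded helper: calls call_once_sketch `attempts` times on
// one flag and reports how often the function actually ran.
int count_invocations(int attempts)
{
    once_flag_sketch flag;
    int calls = 0;
    for (int i = 0; i < attempts; ++i)
        call_once_sketch(flag, [&calls] { ++calls; });
    return calls;
}
```

The helper only checks the "runs once" behaviour from a single thread; it
says nothing about whether the plain read is safe under contention.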

Any flaws with this approach?

One more (silly?) question: "flag" is not a volatile variable.
Does boost::detail::interlocked_read_acquire() make sure the value
doesn't get cached in a register? As far as I can see, after we lock the
mutex we might still hold an old, cached value of "flag" in a register
when the second check runs.
-> Do we need an interlocked read there, too?
   Or should the flag type be marked "volatile"?

(http://msdn.microsoft.com/en-us/magazine/cc163405.aspx#S3
 look at figure 7 and the text below)

Best regards,
Thomas Jarosch



