From: Tobias Schwinger (tschwinger_at_[hidden])
Date: 2007-08-26 05:37:17
Andrey Semashev wrote:
> Hello Tobias,
> Friday, August 24, 2007, 4:19:43 AM, you wrote:
>>> Just came across this thread. I had a need of lightweight_call_once in
>>> my Boost.FSM library and implemented it. It is not implemented as an
>>> internal part of the library, but rather as a common tool, like
>> Something you'd like to brush up as a Boost X-File ;-)?
> Hmm, I'm not sure of the purpose of this project. Is it supposed to
> pass several tools under its umbrella to boost via fast-track review?
Sort of. It's just an idea, so far.
Its purpose is to avoid lots of fast-track reviews (and reviewing
overhead) for utility components by grouping them into a "pseudo
library", thus encouraging developers to brush up / factor out useful
>> The pthreads implementation seems to be using a global Mutex, which is
>> inefficient, because it causes concurrent initializations (that might
>> have nothing to do with each other) to be queued.
>> To make things worse, that Mutex is initialized with 'pthread_once'.
> Yes, but consider that this code will be executed only once. The rest
> of the execution time this mutex is useless.
Consider the deadlock if 'once' is used recursively to initialize
Further, it's quite unintuitive that a trivial initialization might get
slowed down by one in another thread that takes a lot of time.
> As mutexes may actually
> take some system resources (not sure whether it's true or not on the
> wide variety of platforms out there), having a separate mutex for
> every call_once is a direct waste of them.
You can call 'pthread_mutex_destroy' once you're done with the mutex to
free any system resources it may have acquired.
>> Also, some platforms will not call 'mutex_destroyer' within a dynamic
>> library (you probably know)...
> No, I'm not aware of this. Could you elaborate, please? Which
> platforms are those?
No ctors/dtors are run in static context for shared libraries on most
>> The "trigger" could contain the mutex and the macro for initialization
>> would contain PTHREAD_MUTEX_INITIALIZER, so its creation can be done at
>> compile time by setting up the appropriate bytes in the data segment
>> (interestingly, you use a similar technique for the "no atomics variant").
> In the "no atomics case" I had no other choice as I needed a mutex to
> safely read the once flag.
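To make the quoted suggestion concrete, here is a minimal sketch of a trigger that carries its own statically initialized mutex. The names 'once_trigger', 'ONCE_TRIGGER_INIT' and 'call_once_sketch' are made up for illustration and are not taken from the implementation under discussion. For simplicity it always takes the lock, as the "no atomics variant" does, so it shows the static setup rather than a fast path:

```cpp
#include <pthread.h>

// Hypothetical trigger type: the mutex lives inside the trigger, so
// unrelated initializations never queue behind one global mutex.
struct once_trigger
{
    pthread_mutex_t mutex;
    bool            called;   // only read or written while 'mutex' is held
};

// Static initializer: the bytes are set up in the data segment, so no
// constructor and no 'pthread_once' call is needed to create the mutex.
#define ONCE_TRIGGER_INIT { PTHREAD_MUTEX_INITIALIZER, false }

template <typename F>
void call_once_sketch(once_trigger& t, F f)
{
    pthread_mutex_lock(&t.mutex);
    if (!t.called)
    {
        try
        {
            f();
        }
        catch (...)
        {
            // Leave 'called' unset so a later call can retry.
            pthread_mutex_unlock(&t.mutex);
            throw;
        }
        t.called = true;
    }
    pthread_mutex_unlock(&t.mutex);
}
```

Because every trigger has its own mutex, an initializer may itself call 'call_once_sketch' on a different trigger without deadlocking. Recursion on the same trigger would still block.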
>> Other implementations use "while (check) sleep; stuff", which seems
>> sorta awkward to me. Can't we use "proper" synchronization?
> The fundamental problem arises here - I need to safely create a
> synchronization object. Non-POSIX APIs don't provide things like
> PTHREAD_MUTEX_INITIALIZER or I didn't find them in the docs.
I see. Would it be an option to use 'yield' instead of 'sleep'?
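Roughly what I have in mind, as a sketch only. It uses C++11 'std::atomic' and 'std::this_thread::yield' as stand-ins for whatever atomics and yield call a given platform offers; 'yield_once_flag' and 'call_once_yield' are hypothetical names:

```cpp
#include <atomic>
#include <thread>

// Hypothetical flag with three states: 0 = not run, 1 = running, 2 = done.
// The constexpr constructor of std::atomic allows static initialization.
struct yield_once_flag
{
    std::atomic<int> state{0};
};

template <typename F>
void call_once_yield(yield_once_flag& flag, F f)
{
    for (;;)
    {
        int seen = flag.state.load(std::memory_order_acquire);
        if (seen == 2)
            return;                       // already initialized

        if (seen == 0 &&
            flag.state.compare_exchange_strong(seen, 1,
                                               std::memory_order_acquire))
        {
            // This thread won the race and runs the initializer.
            try
            {
                f();
            }
            catch (...)
            {
                // Reset so that another caller can retry.
                flag.state.store(0, std::memory_order_release);
                throw;
            }
            flag.state.store(2, std::memory_order_release);
            return;
        }

        // Someone else is initializing: give up the time slice
        // instead of sleeping for a fixed interval.
        std::this_thread::yield();
    }
}
```

This is still a busy wait, so it only pays off when initializers are short.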
> besides that, the aforementioned drawback with waste of resources
> comes up again.
Again, explicit disposal will do the trick.
>> Win32 provides the 'InitOnceExecuteOnce' which seems to do pretty much
>> all we need (it even takes a parameter to get the state in and
>> boost::function and a downcast from 'PVOID' will do for the type
>> erasure). After putting another guard around it (to avoid the dynamic
>> call and the construction of the boost::function) we're all done.
> InitOnceExecuteOnce is available only since Vista. I'd rather not
> introduce such constraints on the execution platform. Although, there
> could be an alternative implementation for WinAPI, for ones that will
> not execute their apps on XP, for example.
Bummer! Would've been too easy...
>> I don't know too much about other threading platforms, but I'm sure
>> there are similar means. We could also use a counter for the guard to
>> use a Semaphore (if it's more handy to do so for some platform) to
>> notify threads waiting for initialization to complete:
> See my note above. I can't safely create a single semaphore or mutex
> by an API-function call (which was not shown in your code sample). You
> may see my dancing around creating a semaphore in BeOS implementation
> to feel the problem.
I (maybe falsely) assumed that one could obtain one, statically. I'm
aware that "by-call initialization" is problematic (as we'd need 'once',
once again ;-)).
>> For some platforms (such as x86) memory access is atomic, so atomic
>> operations are just a waste of time for simple read/write operations as
>> the 'is_init' and 'set_called' stuff.
> The point is not only in atomic reads and writes, but in performing
> memory barriers too. Otherwise the result of executing the once
> functor could not have been seen by other CPUs.
Then the memory barriers will suffice for x86, correct? As this code is
executed on every call, any superfluous bus-locking should be avoided.
Alternatively, doing an "uncertain read" to check whether we might need
initialization before setting up the read barrier might be close enough
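A sketch of the "uncertain read" idea, written with C++11 atomics purely for illustration (an assumption on my part, the code under discussion uses its own primitives). 'fenced_once_flag' and 'call_once_fenced' are hypothetical names:

```cpp
#include <atomic>
#include <mutex>

// Hypothetical flag type for this sketch.
struct fenced_once_flag
{
    std::atomic<bool> called{false};
    std::mutex        guard;
};

template <typename F>
void call_once_fenced(fenced_once_flag& flag, F f)
{
    // "Uncertain read": a plain atomic load with no ordering attached.
    if (flag.called.load(std::memory_order_relaxed))
    {
        // Only now set up the read barrier, so that everything the
        // initializer wrote is visible to this thread.
        std::atomic_thread_fence(std::memory_order_acquire);
        return;
    }

    // Slow path: serialize and check again under the lock.
    std::lock_guard<std::mutex> lock(flag.guard);
    if (!flag.called.load(std::memory_order_relaxed))
    {
        f();   // if this throws, the flag stays unset
        flag.called.store(true, std::memory_order_release);
    }
}
```

On x86 the acquire fence typically costs no bus-locked instruction, so the path taken on every call after initialization is a load and a branch.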
>> There's some code that throws exceptions with pretty, formatted error
>> messages: So we're out of resources and execute a whole bunch of code to
>> format an error message... That code might run into the same problem
>> we're trying to report, so probably throwing something lightweight (such
>> as an enum) is a more appropriate choice (and also rids us of some
>> header dependencies).
> Well, you may be right here. I could try to reduce memory allocations
> in error handling.
> But the only possible problem I see there is memory depletion. In such
> case you'll get std::bad_alloc which adheres to the declared interface of
> the implementation. So, strictly speaking, if you have enough memory
> you get a detailed error description. If not, you get bad_alloc.
Depending on 'lexical_cast', 'iostream' and 'string' still slightly bugs
Another potential issue: It seems Win32 and MacOS variants are currently
not exception-safe. That is, the initialization routine isn't rerun if
it has thrown the first time 'once' was called.
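The behaviour I'd expect can be sketched as follows, assuming C++11 'std::mutex' for brevity. 'retry_once_flag' and 'call_once_retry' are hypothetical names. The key point is that the flag is set only after the routine has returned normally:

```cpp
#include <mutex>

// Hypothetical flag type for this sketch.
struct retry_once_flag
{
    std::mutex guard;
    bool       called = false;   // protected by 'guard'
};

template <typename F>
void call_once_retry(retry_once_flag& flag, F f)
{
    std::lock_guard<std::mutex> lock(flag.guard);
    if (flag.called)
        return;

    f();                  // if this throws, 'called' stays false and the
                          // lock is released by the guard's destructor
    flag.called = true;   // reached only after a successful run
}
```

With this ordering, a routine that throws on the first call is simply run again on the next one, and is skipped only once it has succeeded.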