I was looking over the implementation of the try_mutex class and noticed that internally it uses a Win32 mutex, and that the do_trylock method calls WaitForSingleObject with a timeout of 0. The same thing can be achieved with a critical section and the TryEnterCriticalSection API, which is lightweight compared to a mutex plus a WaitForSingleObject call.
TryEnterCriticalSection works as follows:
If there is no contention on the lock, the lock is acquired in user mode with a few lock-prefixed assembly instructions.
If there is contention on the lock, TryEnterCriticalSection returns immediately without acquiring it.
The EnterCriticalSection API works similarly, except that when there is contention on the lock it ends up in a kernel-mode wait, much as the WaitForSingleObject call does, and does not return until the lock is acquired.
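To make the behaviour above concrete, here is a minimal sketch of a try-lock built on a critical section. The names cs_try_mutex and try_from_other_thread are made up for illustration, and the std::mutex branch is only an assumption so the sketch also builds off Windows. It is not meant as the actual try_mutex implementation, and it uses C++11 for the demonstration thread.

```cpp
#include <thread>
#ifdef _WIN32
#  include <windows.h>
#else
#  include <mutex>   // assumption: fallback only so the sketch builds off Windows
#endif

// Hypothetical wrapper, not the library's try_mutex.
class cs_try_mutex {
public:
    cs_try_mutex(const cs_try_mutex&) = delete;
    cs_try_mutex& operator=(const cs_try_mutex&) = delete;
#ifdef _WIN32
    cs_try_mutex()  { InitializeCriticalSection(&cs_); }
    ~cs_try_mutex() { DeleteCriticalSection(&cs_); }
    void lock()     { EnterCriticalSection(&cs_); }   // blocks under contention
    // Returns at once: true if the lock was taken, false if another thread owns it.
    bool try_lock() { return TryEnterCriticalSection(&cs_) != 0; }
    void unlock()   { LeaveCriticalSection(&cs_); }
private:
    CRITICAL_SECTION cs_;
#else
    cs_try_mutex() = default;
    void lock()     { m_.lock(); }
    bool try_lock() { return m_.try_lock(); }
    void unlock()   { m_.unlock(); }
private:
    std::mutex m_;
#endif
};

// Attempts the lock from a second thread and reports whether it succeeded.
// A second thread is used because a critical section is recursive: a try
// from the thread that already owns it would succeed.
bool try_from_other_thread(cs_try_mutex& m) {
    bool got = false;
    std::thread t([&] {
        got = m.try_lock();
        if (got) m.unlock();
    });
    t.join();
    return got;
}
```

While one thread holds the lock, try_from_other_thread should report failure without blocking; once the lock is released it should succeed.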
WaitForSingleObject is heavyweight in that the work is always done in kernel mode, with a much longer code path.
Since TryEnterCriticalSection is only available on WinNT 4.0, Win 2K, and XP, one possible way to implement this is to restructure try_mutex to use the pimpl idiom and check the platform when the object is constructed. The result of the platform check can be cached to speed up construction of later try_mutex objects.
Also, I found that it is slightly more efficient to use WaitForSingleObjectEx and WaitForMultipleObjectsEx rather than the non-Ex versions. Tracing in the debugger reveals that the non-Ex versions simply call the Ex versions with the bAlertable parameter set to FALSE, so calling the Ex versions directly avoids the overhead of an extra function call.