Threads-Devel :
From: Matt Hurd (matt.hurd_at_[hidden])
Date: 2006-03-01 10:58:07
G'day,
> On Thursday 02 March 2006 00:58, Anthony Williams wrote:
> Hi all,
>
> I thought I'd start off some discussion about mutexes, to inaugurate our
> new mailing list.
>
> The mutex is the basic synchronization primitive for shared-memory
> threading models, such as that of boost.thread, so it is very important
> that the model, and implementation, is sound. With a working mutex, you can
> build more expressive synchronization mechanisms, such as the smart shared
> pointers proposed by Roland.
This doesn't just apply to shared-memory threading models. As the name suggests, a
mutex can simply prevent concurrent access to a resource, which may mean
re-entrancy protection in a single-threaded model. The concept is mutual
exclusion, as per its name. The resource might be a physical thing that can only
be accessed one at a time, like the hardware registers of a device or some such.
It is also important to remember that this may apply to a singly threaded
restricted area in an application that is multi-threaded. For example, it is
unfortunate that boost::shared_ptr adopts its concurrency policy based on
whether or not it is a multi-threaded build. That is the conservative, safe
approach, but it restricts the usefulness of the pointer. I use
boost::shared_ptr by default in my apps and replace it with a policy-based
pointer, usually Loki's, where I'm performance critical and know there is no
concurrent access. In practice this has only really been necessary where I
need reference-counted access and know there will only be one user per thread,
possibly different threads at different parts of the lifecycle, but never
concurrent access.
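To make the idea concrete, the shape of such a policy-based counted pointer is
roughly as below. The names are made up for illustration and this is not Loki's
actual signature; the point is simply that the threading policy is chosen per
use site rather than per build:

    // Hypothetical sketch, not Loki's real interface.
    struct single_threaded            // no locking, no atomic ops
    {
        static long increment(long& c) { return ++c; }
        static long decrement(long& c) { return --c; }
    };

    struct multi_threaded             // would use atomic ops or a mutex
    {
        static long increment(long& c);
        static long decrement(long& c);
    };

    template <typename T, typename ThreadingPolicy = single_threaded>
    class counted_ptr
    {
        T* p_;
        long* count_;
    public:
        explicit counted_ptr(T* p) : p_(p), count_(new long(1)) {}
        counted_ptr(const counted_ptr& other) : p_(other.p_), count_(other.count_)
        { ThreadingPolicy::increment(*count_); }
        ~counted_ptr()
        {
            if (ThreadingPolicy::decrement(*count_) == 0) { delete p_; delete count_; }
        }
        T* operator->() const { return p_; }
        T& operator*() const { return *p_; }
    private:
        counted_ptr& operator=(const counted_ptr&);   // omitted for brevity
    };

Where I know there is never concurrent access to the count,
counted_ptr<T, single_threaded> costs nothing for thread safety it doesn't need.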
> In C++, there are several primary usage scenarios for mutex objects:
>
> 1. Standalone, as an object of static storage duration at namespace or
> block scope.
>
> 2. As a class member, for a class object of static storage duration.
>
> 3. As a class member, for a class object created on the stack or heap.
>
> In each case, the mutex object is used to synchronize access to other
> objects at the same scope.
Not always. Different scoping is not that uncommon in practice, I think.
On some platforms mutexes are not necessarily cheap resources and their number
has a finite bound. You may wish for object-level locking for a big
collection, but you can't swing it. Older versions of Windows are like this,
for example, where you are restricted by "system resources" of 64 to 128K.
Here a common structure might be a mutex pool, for example a vector of mutexes
where the access is hashed on object address, and your object is initialised
with a reference to a mutex in the pool that it shares with other objects.
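Something along these lines is what I mean; just a sketch, with made-up names,
but the shape is a fixed-size pool and a hash of the object's address:

    #include <boost/thread/mutex.hpp>
    #include <cstddef>

    // Sketch: a small pool of mutexes shared by many objects.
    class mutex_pool
    {
        enum { pool_size = 16 };
        boost::mutex mutexes_[pool_size];
    public:
        boost::mutex& for_address(const void* p)
        {
            // hash the object's address onto one of the pooled mutexes
            std::size_t h = reinterpret_cast<std::size_t>(p) / sizeof(void*);
            return mutexes_[h % pool_size];
        }
    };

    // Each object keeps a reference to the pooled mutex it shares with others.
    class item
    {
        boost::mutex& guard_;
        int value_;
    public:
        explicit item(mutex_pool& pool) : guard_(pool.for_address(this)), value_(0) {}
        void set(int v)
        {
            boost::mutex::scoped_lock lock(guard_);
            value_ = v;
        }
    };

Sixteen mutexes here rather than one per object, so collisions cost some
contention but the resource consumption is bounded.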
A common structure we should support to prevent deadlocks is the idea of a
group of mutexes for simultaneous locking. Deadlock is guaranteed not to
occur if they are always acquired in the same order. Enforcement of ordered
acquisition will prevent deadlock, a noble goal.
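For instance, something like this (a sketch only, and it assumes a mutex that
exposes lock()/unlock() directly, like the basic_mutex primitives you describe
below) would enforce a global order by acquiring a pair of mutexes by ascending
address:

    #include <functional>

    // Sketch: acquire two mutexes in a fixed (address) order, so no two
    // threads can each hold one while waiting for the other.
    template <typename Mutex>
    class ordered_pair_lock
    {
        Mutex& first_;
        Mutex& second_;
    public:
        ordered_pair_lock(Mutex& a, Mutex& b)
            : first_(std::less<Mutex*>()(&a, &b) ? a : b),
              second_(std::less<Mutex*>()(&a, &b) ? b : a)
        {
            first_.lock();
            second_.lock();
        }
        ~ordered_pair_lock()
        {
            second_.unlock();
            first_.unlock();
        }
    private:
        ordered_pair_lock(const ordered_pair_lock&);
        ordered_pair_lock& operator=(const ordered_pair_lock&);
    };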
> It is essential that the mutex be correctly initialized by the time the
> first thread attempts to lock it. For objects of static storage duration,
> this requires that the mutex can be initialized with a static initializer,
> and not rely on dynamic initialization, since dynamic initialization
> exposes the possibility of race conditions.
Static mutexes are really handy but very annoying due to the syntactic
messiness of their static initialisation requirements. It would be nice if
there were some syntactic sugar that led to their safe concurrent
initialisation at start-up. The syntax you suggest below, of the pthread-like
BOOST_MUTEX_INITIALIS(Z)ER, is good. It is necessary to support this, as the
initialisation can then happen at compile time and end up as a zero in the
executable image, for example. The messiness of always needing the external
definition of the static outside the class, and therefore a cpp location or an
anonymous-namespace trick, makes me wonder whether a registration process/macro,
where all the static mutexes get initialised in a one-off singleton object at
start-up, might be neat... Dunno.
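To show the messiness I mean, a statically initialised mutex as a class member
(using the initializer you propose below, so this is only the proposed
interface, not anything that exists today) still needs its one external
definition somewhere:

    // header: the static member can only be declared here...
    class registry
    {
        static boost::mutex guard_;      // declaration only
    public:
        static void add(int id);
    };

    // ...and must be defined, with its static initializer, in exactly one
    // cpp file or behind an anonymous-namespace trick:
    boost::mutex registry::guard_ = BOOST_MUTEX_INITIALIZER;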
> Constructor calls constitute dynamic initialization, so therefore we need
> mutex objects which are aggregates, with no members that require
> constructor calls, so there are no constructor calls. Ideally,
> zero-initialization would be sufficient, as then the user would not have to
> specify an initializer, but failing that, a static initializer akin to
> PTHREAD_MUTEX_INITIALIZER would work. e.g.
>
> boost::mutex m=BOOST_MUTEX_INITIALIZER;
>
> At the other end of the application lifetime, we have destruction. Since we
> have no idea how many threads will be running during application shutdown,
> or what order objects will be destructed, we need to be conservative here,
> too --- we need the mutex objects to remain usable for as long as possible.
> On the platforms which I know about, it is acceptable to leak OS resources
> such as mutexes, as these will be cleaned up by the OS on program exit.
>
> Therefore, I suggest that the most straightforward way to deal with
> destruction order race conditions is for the mutex objects to have no
> destructors (which is consistent with being POD).
This is unacceptable, I feel, because of the resource consumption issues on some
platforms.
> For the second use case above, we have a slightly different scenario. As a
> class member, we cannot use a static initializer, unless the enclosing
> class is also an aggregate. Therefore, we need to rely on zero
> initialization, or a constructor call.
>
> The constructor call in question could be the copy constructor, if we
> provide a function that returns a suitably-initialized object to copy, or
> we provide a statically-initialized "master copy". However, in either case
> the user then has to remember to explicitly initialize the member. e.g.
>
> class X
> {
> boost::mutex m;
> public:
> X():
> m(BOOST_MUTEX_DYNAMIC_INITIALIZER)
> {}
> };
>
> X x;
>
> If the user has an object of static storage duration which requires dynamic
> initialization, then it has to be up to them to ensure that the
> aforementioned possibilities of race conditions are dealt with. Also, once
> we're in to dynamic initialization, then this case is really akin to the
> third case.
>
> In the third case, as a member of an object on the stack or heap, we cannot
> rely on zero initialization, so some form of initializer is a must. The
> preference would be for a default constructor, but this clashes with the
> requirement for pure-static initialization above. If we can rely on
> zero-initialization, just requiring explicit initialization with an empty
> initializer is enough, but that's easy to miss. The dynamic initializer
> suggested above would work, as would an explicit initialize() function, but
> both of these are cumbersome.
>
> In this case, destruction is actually now an important issue --- if the
> object does not live for the length of the whole application, then leaking
> OS resources is not acceptable. We could provide an explicit destroy()
> function, akin to pthread_mutex_destroy(), but in C++ this is really the
> territory of destructors.
>
> I therefore suggest that we supply two mutex types, one of which is POD,
> and requires a static initializer, and one of which has a default
> constructor and a destructor. Of course, if we equip the first one with
> explicit initialize() and destroy() functions, then the second can just be
> a wrapper for the first that calls initialize() in the constructor, and
> destroy() in the destructor.
>
> I suggest that these two different variants are named boost::static_mutex
> and boost::mutex.
>
> This suggestion is orthogonal to any other variation in the mutex type, so
> we would also have boost::static_try_mutex vs boost::try_mutex, and so
> forth.
>
> I have implemented my suggestion for Win32, except at the moment the types
> are named boost::detail::basic_mutex and boost::mutex. I have also
> implemented boost::detail::basic_recursive_mutex and
> boost::recursive_mutex. boost::try_mutex and boost::recursive_try_mutex are
> also implemented, as simple extensions, since the basic_xxx variants
> support try_locks. These are all in the boost CVS, on the thread-rewrite
> branch. The reason I have left the basic_xxx classes in boost::detail is
> because they are just primitives; they expose the lock/unlock functionality
> directly, rather than through the scoped_lock classes dictated by the
> current interface docs.
>
> When I last looked at this, I named the basic_mutex version
> lightweight_mutex, and had a discussion about it with Roland. The earlier
> version is still in CVS, but is no longer used for boost::mutex. I believe
> the current version is marginally simpler, and also
> boost::detail::lightweight_mutex is a name used by the shared_ptr
> implementation, which is why I have changed it to basic_mutex. The old
> implementation has been left in, as I haven't yet changed my implementation
> of read_write_mutex, which uses it.
>
> One of Roland's concerns was that this essentially duplicates
> CRITICAL_SECTION functionality.
Nothing wrong with that; a critical section is a mutual exclusion guarantee.
> The lightweight_mutex used with shared_ptr
> is essentially just a wrapper around CRITICAL_SECTION on Windows, so why do
> we need anything different?
>
> The answer to this is use case 1 above --- CRITICAL_SECTION objects don't
> have static initializers; they require a call to InitializeCriticalSection,
> or InitializeCriticalSectionAndSpinCount before use.
>
> I could have just used a Win32 Mutex object, but that's quite
> heavyweight. Instead, the code uses a Win32 Semaphore, alongside a lock
> count. The semaphore is only used if there is contention for the lock, so
> if contention is rare there is a performance boost.
>
> basic_recursive_mutex wraps basic_mutex, but also storing the locking
> thread id, and the recursion count. If the same thread tries to reacquire
> the lock, it just increases the count, and it only releases the lock when
> the count hits zero.
>
> The next challenge is timed_mutex --- this cannot be written as an
> extension of the current scheme, since that would expose a race condition.
> I might have to revert to my lightweight_mutex scheme for that.
Hmmm, the mutex isn't timed. The lock is timed, or rather, the attempt to
lock is timed. Important difference.
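In other words, with the current interface the usage is roughly this (from
memory, so treat it as a sketch):

    #include <boost/thread/mutex.hpp>
    #include <boost/thread/xtime.hpp>

    boost::timed_mutex m;

    void try_for_a_second()
    {
        boost::xtime deadline;
        boost::xtime_get(&deadline, boost::TIME_UTC);
        deadline.sec += 1;   // the timeout belongs to this one lock attempt
        boost::timed_mutex::scoped_timed_lock lock(m, deadline);
        if (lock.locked())
        {
            // acquired within a second; the mutex itself knows nothing of time
        }
    }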
> I also have a "checked" version of basic_mutex (not in CVS), that checks
> which thread owns the mutex, and throws if the same thread tries to lock it
> twice, or a thread that doesn't own it tries to unlock it. This requires
> additional storage for the ID of the locking thread, and also has the
> overhead of the additional checks. Does anyone think it's worth adding?
Perhaps, especially for debugging. I also wonder if we should require a name
concept for a mutex, as this would give some orthogonality across shared-memory
mutexes across process space. Also, a debug version of a mutex could have
concurrency checks as well as logging to simplify debugging. Debugging
errors in concurrent access is one of the nastiest and hardest things in
programming. A debug mutex that can report its name, a deadlock condition
and the names of the calling threads, for example, would be rather magic and
very simple to implement. Worth its weight in electrons, methinks.
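Roughly what I have in mind, as a sketch with made-up names, wrapping any mutex
that exposes lock()/unlock() directly:

    #include <iostream>
    #include <string>

    // Sketch: a debug wrapper that names the mutex and logs lock traffic,
    // so a stuck lock can at least be identified by name in the output.
    template <typename Mutex>
    class named_debug_mutex
    {
        Mutex m_;
        std::string name_;
    public:
        explicit named_debug_mutex(const std::string& name) : name_(name) {}
        void lock()
        {
            std::clog << "acquiring " << name_ << std::endl;
            m_.lock();
            std::clog << "acquired " << name_ << std::endl;
        }
        void unlock()
        {
            m_.unlock();
            std::clog << "released " << name_ << std::endl;
        }
        const std::string& name() const { return name_; }
    };

A real version would also record the owning thread id so it could report who
holds what when things wedge.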
Some more food for thought.
A condition variable relies on a mutex. We haven't considered the
orthogonality of a condition being able to be built around any mutex concept.
This is important to consider, as it would be great to do, but it makes building
conditions very difficult, as we would have to write one ourselves rather than
rely on just a posix wrapper on posix platforms.
A shared mutex and a null mutex need to be considered.
Boost misses this requirement completely and it is devastating. Libraries
like ACE have supported this for many years. We can go a step further and
support a substitutable conceptual hierarchy like this:
shared_mutex
^
|
recursive_mutex
^
|
simple_mutex
^
|
null_mutex
This is a conceptual hierarchy which gives Liskov substitutability and
enables generic code to be written to the shared mutex concept, with the
knowledge that "don't pay for what you don't use" is alive and well: you can
throw in a null mutex and have a normal class/object (see the sketch below).
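A sketch of what I mean (again assuming a mutex concept that exposes
lock()/unlock(); a real version would use a scoped lock for exception safety):

    // A null mutex satisfies the same concept but compiles away.
    struct null_mutex
    {
        void lock() {}
        void unlock() {}
    };

    template <typename Mutex>
    class counter
    {
        Mutex guard_;
        long value_;
    public:
        counter() : value_(0) {}
        long next()
        {
            guard_.lock();        // a no-op when Mutex is null_mutex
            long v = ++value_;
            guard_.unlock();
            return v;
        }
    };

    // counter<null_mutex>   -- single threaded, zero locking overhead
    // counter<simple_mutex> -- the same code, now safe for concurrent callers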
I use this pattern a lot and have a synch wrapper that wraps the boost mutex
implementation to give me this. Such a wrapper also exposes other synch
stuff, like atomic ops and fencing, so these can be used w.r.t. the
conceptual need at the time, so to speak.
In my opinion all of the STL should support such a concept for the containers,
and this could be an extension that boost offers.
I think the statically initialised mutex concept is orthogonal to this, but I
don't have a clear picture of it in my head.
Going back to the containers for a second, there is a natural two-level mutex
concept that we should think about supporting out of the box. A container has
activity at the container level _and_ each object in the container may have
its own mutex requirement. There is a relationship here, as a common pattern
is a resource pool of mutexes for a container, to limit resource
consumption and provide better scalability than a single mutex for all
objects. So there is a container access mutex and a mutex per object, and the
mutex may be one and the same for simple requirements.
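Roughly the shape I have in mind, as a sketch (one mutex per object here for
brevity; in practice the per-object guards would come from a pool like the one
sketched earlier, and this assumes entries are not removed while in use):

    #include <boost/thread/mutex.hpp>
    #include <map>

    struct account
    {
        boost::mutex guard;               // object-level lock
        double balance;
        account() : balance(0.0) {}
    };

    class account_table
    {
        boost::mutex table_guard_;        // container-level lock
        std::map<int, account*> accounts_;
    public:
        account* find(int id)
        {
            boost::mutex::scoped_lock lock(table_guard_);   // structural access
            std::map<int, account*>::iterator i = accounts_.find(id);
            return i == accounts_.end() ? 0 : i->second;
        }
        void deposit(int id, double amount)
        {
            account* a = find(id);        // table lock released before...
            if (a)
            {
                boost::mutex::scoped_lock lock(a->guard);   // ...object lock taken
                a->balance += amount;
            }
        }
    };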
Deadlock becomes a real possibility with such things, as the potential ordering
of access needs to be constrained. E.g. if you need to lock two or more
objects in the collection, then the locks should be required to be acquired in
the same order, or there should be a scheme where, if you hold a lock and need
another object, you may have to give up your lock and then attempt to acquire
both; otherwise you're inviting a deadlock.
Containers of containers make this even more interesting :-)
$0.02
matt.