Boost Users :

From: Gottlob Frege (gottlobfrege_at_[hidden])
Date: 2007-01-23 13:08:27


On 1/23/07, Ovanes Markarian <om_boost_at_[hidden]> wrote:
>
> Actually I read all your and Tony's points and maybe I was misunderstood.

You were not misunderstood at all. I've gone down the same road as you.
More than once. With various techniques, including this create_getter vs
non_creating_getter idea.

> My first question is:
>
> If mutex does not guarantee thread safety what then?

It only guarantees thread safety when used for ALL accesses of the shared
variables. Not just on the write of the shared variables. You need it for
both read and write. Not just because the shared variable may change
'while' you are reading it, but because it may have changed, but your
processor hasn't 'seen' those changes yet, even though it has seen other
changes that happened 'before' the shared variable changed. This is the
seeming paradox of DCLP and modern CPU architecture.
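
A minimal sketch of what "use the mutex for ALL accesses" looks like. It is my illustration, not the code under discussion: it assumes a C++11 compiler, uses std::mutex as a stand-in for boost::mutex, and the value() member is a hypothetical addition so there is something to read back.

```cpp
#include <mutex>

// Hypothetical example class; std::mutex stands in for boost::mutex.
class Singleton {
public:
    static Singleton* instance() {
        // Every caller takes the same lock, readers included. Locking and
        // unlocking also supply the memory barriers that make the
        // constructor's writes visible to the next thread that locks.
        std::lock_guard<std::mutex> lock(s_m);
        if (pInstance == nullptr)
            pInstance = new Singleton;
        return pInstance;
    }

    int value() const { return value_; }

private:
    Singleton() : value_(42) {}

    static std::mutex s_m;
    static Singleton* pInstance;
    int value_;
};

std::mutex Singleton::s_m;
Singleton* Singleton::pInstance = nullptr;
```

The cost is a lock on every call, which is exactly what DCLP and the getter-swapping idea try to avoid.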

>
> //creative get
>
> Singleton* Singleton::creating_singleton_getter()
> {
>     boost::mutex::scoped_lock lock(s_m); // always called when entered
>
>     // all other calls to this function are blocking, so it is not
>     // possible to enter this function twice if lock is active
>     if(Singleton::pInstance == NULL)
>         Singleton::pInstance = new Singleton; // does not matter how these steps are
>                                               // executed and reordered by compiler,
>                                               // since the function can only be
>                                               // entered when s_m is unlocked
>     Singleton::getter = &non_creating_getter; // this is still guarded by locked mutex!!!

No, getter is not 'still guarded'. As soon as it is set, another thread can
now start using the non_creating_getter. What if the compiler DID reorder
the instructions?:

        Singleton::getter = &non_creating_getter; // line 1
        if(Singleton::pInstance == NULL)
                Singleton::pInstance = new Singleton; // line 2

Certainly, in this case, between lines 1 and 2, another thread could come in
and start using non_creating_getter too early.

Now, imagine that it wasn't the compiler that reordered the lines, but
instead the processor (ie using speculative execution). Or not the
processor, but the memory bus. That's what happens. They will still appear
in order for the one processor, but not necessarily for another processor.
Worse, it depends on the platform, so this bug is not yet very visible, and
that's why we have so much code relying on it working. So much that I'm
surprised that chip makers even consider allowing the reordering to happen -
I would expect it to break too much code.

Similarly, by the way, you can't even be sure that the bytes of the
Singleton are seen to be written before the pointer pInstance that points
to them is seen to be set!

> } //mutex unlock

So let's tighten the mutex boundary:

Singleton* Singleton::creating_singleton_getter()
{
       {
              boost::mutex::scoped_lock lock(s_m); // acquire mutex here

              if(Singleton::pInstance == NULL)
                     Singleton::pInstance = new Singleton;
       } // mutex released here

       Singleton::getter = &non_creating_getter;
       return Singleton::pInstance;
}

Now the mutex is unlocked before getter is set - this puts a write barrier
between the 2 instructions - which means that THIS processor (and its
memory-handler queue) will NOT change the order of when getter is set. In
effect, it flushes the memory-write-request queue before getter is written
(or, more accurately, the request to write the global memory for getter, is
placed in the write-queue).

And this is where it gets fuzzy for me - from my understanding, it requires
the other processor (where some other thread is running) to queue up 2 read
requests:
  - 'read getter please'
  - 'read the bytes of the new Singleton'
and then have those requests reordered.

The oddity being why would the second request be in the queue before the
first request was answered - ie the second request *depends* on the answer
of the first. I can only imagine that this happens because of 2 reasons:
   - speculative execution - the CPU could see that it was 'probably' going
to read pInstance regardless of getter (which seems more plausible in the
traditional DCLP case where getter is just a flag that is then checked in an
if, so the CPU can easily look ahead).
   - the CPU (or memory controller) had recently read and cached the memory
where pInstance points, and didn't feel a need to re-read it (ie there were
no obvious dependencies and/or no reason that the memory should be different
since the last time it read or wrote that memory). Basically, the idea here
is that the CPU, as a single CPU, is consistent - it is only inconsistent in
the presence of other CPUs, and it depends on the architecture as to whether
those inconsistencies are allowed to exist or not.

And this is where/when you need to start asking on comp.programming.threads,
but I suspect they'll tell you (with better detail and understanding) the
same thing - it just doesn't work without a read barrier on the other
threads.
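
For what it's worth, here is a sketch of what "a read barrier on the other threads" can look like. It assumes a compiler that provides C++11's <atomic> (nothing in the original code used it), with std::mutex standing in for boost::mutex; the payload() member is hypothetical, there only so the object has bytes to publish.

```cpp
#include <atomic>
#include <mutex>

// Hypothetical example class, not the code from this thread.
class Singleton {
public:
    static Singleton* instance() {
        // Acquire load (the read barrier): a thread that sees a non-null
        // pointer here also sees every write made before the matching
        // release store below, including the constructor's writes.
        Singleton* p = pInstance.load(std::memory_order_acquire);
        if (p == nullptr) {
            std::lock_guard<std::mutex> lock(s_m);
            // Re-check under the lock; another thread may have got here first.
            p = pInstance.load(std::memory_order_relaxed);
            if (p == nullptr) {
                p = new Singleton;
                // Release store (the write barrier): the pointer cannot
                // become visible before the object it points to.
                pInstance.store(p, std::memory_order_release);
            }
        }
        return p;
    }

    int payload() const { return payload_; }

private:
    Singleton() : payload_(7) {}

    static std::atomic<Singleton*> pInstance;
    static std::mutex s_m;
    int payload_;
};

std::atomic<Singleton*> Singleton::pInstance(nullptr);
std::mutex Singleton::s_m;
```

The point is that the barrier comes in two halves: the release on the writing thread does nothing for a reader that does not perform the matching acquire.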

> So the point is: As long as Singleton::instance is called from multiple
> threads and these are not created from global vars before main is called,
> this code should be thread safe.

I'm not sure what you are saying about before main, etc. If you are just
concerned about creating_getter being initially set properly, I agree you
are probably OK, since it is static initialization. My only concern there
would be, as mentioned, with Singletons inside DLLs / shared libraries - I
don't think loading shared libraries is thread safe under Linux (which
boggles my mind, but that's what I've heard). And the standard doesn't say
anything about shared libraries.

> The scenario is like this:
>
> Threads:
>
> A          B          C          D
> instance   instance   instance              // only A, B or C will get access to
>                                             // instance, the others will wait
>                                  instance   // if creating get was successful, D calls
>                                             // the lightweight version of getter

The scenario is that D reads the 'new' getter, but still manages to read the
'old' (uninitialized) Singleton, because of crazy modern memory
architectures.

> Static class variables are guaranteed to be initialized before main is
> entered:
> C++ standard 9.4.2 states:
> ...
> Static data members are initialized and destroyed exactly like non-local
> objects (3.6.2, 3.6.3).
> ...
>
> 3.6.2 states:
> ...
> Objects with static storage duration (3.7.1) shall be zero-initialized (8.5)
> before any other initialization takes place. Zero-initialization and
> initialization with a constant expression are collectively called static
> initialization; all other initialization is dynamic initialization.
> ...
>
> So I assume that initialization of getter with the address of a (static)
> class function is a constant expression and therefore is not a dynamic
> initialization.

OK.
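
A small sketch of why that holds, assuming a C++11 compiler (constexpr is used only as a compile-time check). All names here are hypothetical: Probe is an object with dynamic initialization, deliberately defined before getter in the file, that records whether getter was already set when its constructor ran.

```cpp
// Hypothetical example, not the code from this thread.
struct Singleton {
    typedef Singleton* (*Getter)();
    static Singleton* creating_getter();
    static Getter getter;
};

// Compiles only if the address of the function is a constant expression.
constexpr Singleton::Getter kInitialGetter = &Singleton::creating_getter;

// Probe has a non-constexpr constructor, so 'probe' is dynamically
// initialized. It is defined BEFORE getter's definition on purpose.
struct Probe {
    bool saw_getter;
    Probe() : saw_getter(Singleton::getter != nullptr) {}
};
Probe probe;

// Static initialization: done before any dynamic initialization runs,
// regardless of definition order, so 'probe' above already sees it.
Singleton::Getter Singleton::getter = &Singleton::creating_getter;

Singleton* Singleton::creating_getter() {
    static Singleton s;
    return &s;
}
```

This only covers the initial value of getter within one program image; it says nothing about shared libraries, and nothing about the later swap to non_creating_getter.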

> (Please see 5.19 of a standard especially:
> ...
> Other expressions are considered constant-expressions only for the purpose
> of non-local static object
> initialization (3.6.2). Such constant expressions shall evaluate to one of
> the following:
> ...
> - an address constant expression,
> ...
> An address constant expression is a pointer to an lvalue designating an
> object of static storage duration, a
> string literal (2.13.4), or a function.)
>
>
>
> Therefore there should be a guarantee that the Singleton static members are
> initialized before main is entered. The locked mutex guarantees that only
> one thread at one processor will enter the function at the same time.
> Isn't it so?

Yep, only one thread gets into the guarded part of
creating_singleton_getter, but non_creating_getter might still be seen and
used too early.
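
If you want to keep the getter-swapping design, one way to close that gap is to make getter itself the thing that carries the barriers. This is only a sketch under assumptions: it needs C++11's <atomic>, uses std::mutex as a stand-in for boost::mutex, and the Getter typedef, instance() and payload() are names I made up for the example.

```cpp
#include <atomic>
#include <mutex>

// Hypothetical example class, not the code from this thread.
class Singleton {
public:
    typedef Singleton* (*Getter)();

    static Singleton* instance() {
        // Acquire load: a thread that sees non_creating_getter here also
        // sees the fully constructed Singleton published before it.
        Getter g = getter.load(std::memory_order_acquire);
        return g();
    }

    int payload() const { return payload_; }

private:
    Singleton() : payload_(7) {}

    static Singleton* creating_getter() {
        std::lock_guard<std::mutex> lock(s_m);
        if (pInstance == nullptr)
            pInstance = new Singleton;
        // Release store: the new getter cannot become visible to another
        // thread before the writes to pInstance and to the object itself.
        getter.store(&Singleton::non_creating_getter,
                     std::memory_order_release);
        return pInstance;
    }

    // Only reachable after an acquire load has observed the release store.
    static Singleton* non_creating_getter() { return pInstance; }

    static std::mutex s_m;
    static Singleton* pInstance;
    static std::atomic<Getter> getter;
    int payload_;
};

std::mutex Singleton::s_m;
Singleton* Singleton::pInstance = nullptr;
std::atomic<Singleton::Getter> Singleton::getter(&Singleton::creating_getter);
```

Note that every call still pays for the acquire load of getter; whether that is cheaper than the lock depends on the platform.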

> Thanks for your ideas and answers.
>
> Best Regards,
> Ovanes
>
>

I hope it makes sense - it didn't make much sense to me the first 10 times.
You might also want to try comp.programming.threads - it has been discussed
there a few times.

Tony.


