Boost logo

Boost :

From: John Max Skaller (skaller_at_[hidden])
Date: 2001-06-23 11:53:11


Beman Dawes wrote:

> I've written a first draft of a Formal Definition of "Thread". It attempts
> the definition in terms of the C++ Standard. It doesn't contain exact
> wording that would go in a future standard, but it at least identifies a
> number of areas of concern.
>
> Because it is checked in on a branch, the URL is a beauty:

http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/~checkout~/boost/boost/libs/thread/doc/Attic/thread_formal_definition.html?rev=1.1.2.1&content-type=text/html&hideattic=0

        you aren't kidding!

> Comments would be appreciated.

        This is the kind of thing we need. However, it is important to separate
normative text from comments more clearly. for example, the description
of thread states is not normative text, it is just a comment to make
it easier to grasp the intent. Please read the comments below fully,
before replying: generally, I'm going to destroy the text, then try
to find how to write the text so the intent is properly stated.

        In this section:
--------------------------------------------------------------------------------------------------------
Each thread shares certain aspects of its execution environment with all
other threads in the program:

      Static storage duration (static, extern) objects [3.7.1].

      Dynamic storage duration (heap) objects [3.7.3].

      Unless otherwise specified, resources provided by the operating
system. For example, files.

      The program itself. In other words, each thread is executing some
function of the same program, not a totally different program.
-------------------------------------------------------------------------------------------------------

        This is the right idea, but the wording is not right.
In general, you can't talk about 'sharing' things like this,
it has no normative meaning. Instead, you must specify the behaviour
of various things which are well defined, and those whose
semantics are implementation defined.

        To give you an example: two threads can share auto variables.
They can pass a pointers. All memory is shared. You can't mention
'operating system resources' because that isn't a term of the
abstraction.

        Hence: this section should be elided, since it make no
normative contribution to the definition of the abstract machine.
Now consider:

-----------------------------------------------------------------------------------------------
Each thread has its own:

      Unique thread identifier.

      Priority and other specified thread attributes.

      Registers and current execution sequence (program counter)
[1.9/5].

      Automatic storage duration (stack) objects [3.7.2].
-----------------------------------------------------------------------------------------------

I disagree with the first two points: apart from being meaningless,
there in an implied constraint on implementations which is not
necessary.

Point 4 (thread local stack) must be dealt with, but there
is no need at all to say that these objects are local,
that is already implied by the existing model: when
the definition of an auto variable is elaborated,
storage is allocated: there are existing requirements
to support recursion which ought to be sufficient
to support threading. So this point can be elided.

Point 3 is absolutely central. But you cannot
'extend' the abstract machine. You have to CHANGE
the abstract machine specifications. Consider clause
1.9/4: it says that a given instance of the abstract
machine may have 'more than one possible execution sequence ..'

It is talking about sequence points, etc. But this text
is gibberish in the presence of threads, because it
implicitly assumes that elaboration of a program
consists of only one execution sequence.
The idea of threads, of course,
is that each _thread_ has this property.

So the text above (and 1.9/5 and probably various other
places) must be changed. Note that 'execution sequence'
is an undefined term, but it is related to behaviour.

Lets do some analysis: we need a notion of 'obvservable event'.
This is basically 'calls to standard library functions'
because that is the only way that a conformance tester
can check compliance (by inspecting output, or hooking
the various calls).

Currently, there is a notion of behaviour consisting
of a sequence of (observable) events. It is this
we must change. The obvious change is:

        The (obvervable) behaviour of the abstract
machine consists of a sequence of sets of (observable) events.

        We have just allowed arbitrary concurrent execution.
I'm going to have to think how to specify what the behaviour of
a thread is: clearly, it is a sequence of events, taken from
the sequence of sets of events (and preserving order),
but it is more than that (it isn't an arbitrary sequence,
but one which consists of the events generated by
changes in a particular locus of control (AKA program counter).

        Where the Standard currently says 'execution sequence'
it must be implicitly qualified by 'of a particular thread'.
A clause needs to be added : the behaviour of the abstract machine
consists of sets of events, such that for all events e1, e2
such that e1 and e2 are events of the same thread, and such that
e1 occurs before e2, then the set of which e1 is a member in
the behaviour occurs before the set of which e2 is a member.

        [If you think that is long winded, then thats what
comes from misuse of terminology: a short specification
is that the inclusion injection from the ordered set of
events of each thread to the behaviour is a functor. :-]

        Now, this description makes the rest of the Standard
normative when it talks about 'sequence of execution'.
However, it isn't enough, because it provides no way
for vendors to specify what locks, etc, do.

        In the rest of your paper, you provide an excellent
starting point for such a model. But note that the
states, such as 'blocking' have NO MEANING. They are just
arbitrary words, used to specify rules, so that IF a vendor
uses those words to describe some things (like

        the call to 'lock(xxx)' places the thread in
a blocking state until another thread executes unlock(xxx)

then the requirements of the abstract machine
are such that the use of the word 'block' has normative
implications for their implementation in terms of
sequence points, which means that one can deduce
that some access to memory will be OK because the
events for the two threads accessing the memory
CANNOT BE IN THE SAME SET of the abstract machines
behaviour sequence. [More precisely, I guess that
it takes time to do anything, so we're probably saying
that the sequence points before and after the access
do not overlap.

        It is important to understand that (an instance of)
the abstract machine is supposed to be a formal axiomatic system,
with some undefined terms such that a (conforming) implementation is an
_interpretation_ in the in the sense of formal logic. Therefore,
it is vital that some terms such as 'blocking' are undefined.
It is important NOT to define them -- although you can
describe a possible interpretation for ease of understanding,
such description is not a definition, and has no normative content.

        It would be good to try to produce a _completely_
formal document: no descriptive stuff. The reason is,
that in doing so, you will rapidly discover what the real
implications are. For example you say that

        only if the object has not modified by any thread between the execution
        of the thread A and thread B sequence points indicated

but saying 'has not been modified' is gibberish. you have to say

        only if there is no behaviour in which a thread has
modified the object and not reached a synchronisation point

[Where 'synchronisation point' is a yet to be defined sequence point
which also guarrantees that memory writes will be observed by
other threads]

        In other words, you can't say 'has not been modified'.
You have to say 'could not possibly have been modified
or has been modified and the effects propagated'
and you have to say it entirely in terms of the abstract
machine formalism.

> Two points to bring up over and over again when you talk to core language
> people (the compiler writers) are that:
>
> 1) The LWG is talking purely in terms of an optional thread library. A
> compiler will not have to support it if they don't wish to. The standard

> will not mandate whether or not a compiler must supply a threading library;
> it is up to the vendor.

        That is acceptable. The key point is to modify the abstract
machine so that the semantics of a vendor library can be specified.
This is much more important that having a standardised threading
library, even an optional one, although that would be really nice
if a consensus can be found for a proposal that supports major
platforms and provides the programmer with enough functionality
to write a reasonable class of programs.
 
> 2) For many them what we are asking is that the standard be changed to
> reflect what their compilers are already doing. Their compilers are
> already supporting threading libraries; the idea is to standardize existing
> behavior, not invent a whole new set of behaviors.

        Yes.

-- 
John (Max) Skaller, mailto:skaller_at_[hidden]
10/1 Toxteth Rd Glebe NSW 2037 Australia voice: 61-2-9660-0850
checkout Vyper http://Vyper.sourceforge.net
download Interscript http://Interscript.sourceforge.net

Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk