From: williamkempf_at_[hidden]
Date: 2001-08-09 09:14:13


--- In boost_at_y..., Ross Smith <ross.s_at_i...> wrote:
> williamkempf_at_h... wrote:
> >
> > In what specific ways? So far the complaints (unless I miss the
> > intent of what you said) are along the lines of atexit() should be
> > called onexit() so it's less than optimal.
>
> As far as naming goes, my main problem is with the terms "join" and
> "detach". I've never been able to figure out why pthreads uses those
> peculiar names. I can't see _any_ laboratory-detectable trace of a
> connection between what pthread_join() does and the normal English
> meaning of the word "join". Presumably the designers of pthreads had
> some rationale in mind, but I have no clue what it might have been.
> I don't really think they picked the word by sticking a pin in a
> dictionary, but they might as well have for all the mnemonic good it
> does.

Well, I really don't want to get into naming wars, unless names are
truly not appropriate. So I'm mostly not going to touch these at
this stage. However, the terms are used not because of POSIX but
because they are industry standard terms. Most of these terms were
chosen long before POSIX came into being, and even Windows programmers
know what they mean. We don't invent new words when there are well
known words already.

> "Detach" isn't quite as bad, because I can at least see some kind of
> vague connection between what pthread_detach() does and the usual
> meaning of the word, but I still think it's a lousy choice of
> terminology. (But then, I also think it's a lousy choice of
> functionality -- see below.)

If it makes you feel better, Boost.Threads doesn't have a "detach"
method. There's a detached state, but no method called to get there.

> "Wait" and "abandon" have always seemed to me to be the
> self-evidently obvious terms for these operations. I don't
> particularly insist on those _exact_ words, and I think your
> comparison to something ridiculously trivial like "atexit" vs
> "onexit" is a bit offensive, but I do think it's reasonable to insist
> that the name of a function should at least suggest what it actually
> does, and I can't see how "join" can possibly do that, except to
> someone already familiar with pthreads.

No offense was meant, and I find it surprising that you'd take
offense. The term "wait" is familiar enough to me to be valid, though,
again, the term the industry uses (not merely the POSIX term) is
"join". As for "abandon"... that really has a slightly different
technical meaning that applies not to threads but to synchronization
objects.
 
> > Or in a few cases you seem to complain that the interfaces provide
> > more functionality than *you* need, making them less than optimal.
> > [...] it's always better that a library provide more functionality
> > than less (provided it doesn't complicate usage).
>
> I think it was Einstein who said that one should make things as
> simple as possible, but no simpler.

And removing any functionality that's included so far would put us
into the second category, not the first. That's why I said "provided
it doesn't complicate usage".
 
> I don't object to functionality I don't personally need, if someone
> else needs it. What I haven't seen is any evidence that anybody else
> _does_ need those things. Without a demonstrated need, they're just
> gratuitous complexity.

I've given you reasons in the past for "detached" threads. I've also
pointed out that MFC creates detached threads by default. With every
available threading library including this concept, I find it hard to
understand why you can't accept the need for it. More importantly, I
can't comprehend how you feel it complicates things in any way.
 
> One was thread specific storage. I can't see the point; what does it
> do that plain ordinary stack or heap allocation from inside the
> thread doesn't do? If you can provide a rationale, "it does X that
> ordinary storage doesn't do, and that's useful for achieving Y",
> I'll be happy to withdraw my objection.

It provides "global" data accessible only to a single thread of
execution, thus needing no synchronization. The classic example is
making functions such as strtok "thread safe". The strtok function
relies on global data to retain the state of tokenizing a string
between calls. This makes it unsafe to use in MT programs because
there's no way to serialize access to this global data in a thread
safe manner. Change the global data to TSS data, however, and the
problem is automatically solved.
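
To make that concrete, here is a minimal sketch of a strtok work-alike
whose saved position lives in thread specific storage. It is not
Boost.Threads code: it assumes a C++11 compiler (thread_local and
std::thread), and the names tss_strtok, split and concurrent_split_ok
are made up for the illustration.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <thread>
#include <vector>

// Hypothetical strtok look-alike: same calling convention as strtok, but
// the saved position is per-thread instead of a plain global.
char* tss_strtok(char* str, const char* delims) {
    assert(delims != nullptr);
    thread_local char* saved = nullptr;  // one copy per thread, no locking
    if (str != nullptr) saved = str;     // a non-null str starts a new scan
    if (saved == nullptr) return nullptr;

    saved += std::strspn(saved, delims); // skip leading delimiters
    if (*saved == '\0') { saved = nullptr; return nullptr; }

    char* token = saved;
    saved += std::strcspn(saved, delims); // advance to the end of the token
    if (*saved != '\0') { *saved = '\0'; ++saved; }
    else { saved = nullptr; }
    return token;
}

// Tokenizes a private copy of `text` using the legacy calling pattern.
std::vector<std::string> split(std::string text, const char* delims) {
    std::vector<std::string> out;
    for (char* t = tss_strtok(&text[0], delims); t != nullptr;
         t = tss_strtok(nullptr, delims)) {
        out.push_back(t);
    }
    return out;
}

// Runs two tokenizations in separate threads; each thread keeps its own
// saved position, so neither can disturb the other.
bool concurrent_split_ok() {
    std::vector<std::string> a, b;
    std::thread ta([&a] { a = split("a b c", " "); });
    std::thread tb([&b] { b = split("x,y", ","); });
    ta.join();
    tb.join();
    return a == std::vector<std::string>{"a", "b", "c"} &&
           b == std::vector<std::string>{"x", "y"};
}
```

Code written against the old interface keeps calling it the same way;
only the storage behind the saved position changes.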

It's possible instead to have the caller allocate this state info and
pass it in to the strtok function, thus ensuring thread safety
externally. For this simple example this works nicely, which makes it
a less compelling example to some, but it's the easiest to
comprehend. There are numerous cases where designs can't be changed
in this way, however. For example, there's a lot of legacy code that
uses the original strtok interface, and forcing it to change just to
be thread safe results in a lot of work (possibly impossible work
if you don't have access to the code calling strtok), while
switching to TSS is simple and transparent. Or there's the use of
TSS in Boost.Threads itself, where state information about the thread
is stored in TSS data. It's not really possible to achieve the same
thing using any other technique.
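
For comparison, the caller-allocated alternative might look roughly
like the sketch below. The names tokenize_r and count_tokens are
hypothetical (the shape resembles POSIX strtok_r, but this is not that
function). Note that the signature gains a third parameter, which is
exactly the change legacy callers would have to absorb.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical reentrant tokenizer: the caller owns the saved position and
// passes it in on every call, so no global or per-thread state is needed.
char* tokenize_r(char* str, const char* delims, char** state) {
    assert(delims != nullptr && state != nullptr);
    if (str != nullptr) *state = str;    // a non-null str starts a new scan
    if (*state == nullptr) return nullptr;

    char* p = *state + std::strspn(*state, delims);  // skip delimiters
    if (*p == '\0') { *state = nullptr; return nullptr; }

    char* token = p;
    p += std::strcspn(p, delims);        // advance to the end of the token
    if (*p != '\0') { *p = '\0'; *state = p + 1; }
    else { *state = nullptr; }
    return token;
}

// Counts tokens in a writable buffer; the state lives on this call's stack.
int count_tokens(char* text, const char* delims) {
    char* state = nullptr;
    int n = 0;
    for (char* t = tokenize_r(text, delims, &state); t != nullptr;
         t = tokenize_r(nullptr, delims, &state)) {
        ++n;
    }
    return n;
}
```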

The times you need TSS are rare, and are usually only encountered in
library code, but they do occur and we must address them. Doing so
complicates no one's life but that of the Boost.Threads implementor,
so arguing against its presence seems awfully strange.

> The others are abandoning/detaching/whateveryoucallit and
> cancellation.
> Those are really two instances of the same problem: asynchronously
> killing a thread, without stack unwinding.

Who said anything about "without stack unwinding"? You are showing
your Win32 bias here. Win32 does this without any stack unwinding,
yes, but not all thread implementations do this. POSIX specifically
does a C style of unwinding through the use of cleanup handlers, and
more importantly it was carefully designed to allow C++ style stack
unwinding as an implementation detail. As I've said, again and again
to you, Boost.Threads currently doesn't include cancellation because
of issues in getting the stack unwinding done "right"
and "portably". It won't be included until an implementation can do
both. So much for the complaint about cancellation.

Detached threads, on the other hand, are by their very nature
designed to not have problems related to this. Stack and heap data
is collected automatically when the process terminates (which is the
same as when the thread is terminated) and other resources are either
not used or are reclaimed via exit handlers. There is a danger that
the thread may not be designed correctly, but once we have proper
cancellation semantics this will be a non-issue, since running threads
will be cancelled before main() returns. So we're including the
ability to begin with despite the (minimal) danger. In the end, if
you find the danger too great you simply don't create detached
threads yourself. No added complexity and no compelling reason to
remove the needed functionality.

> The whole concept is a
> consequence of people thinking in C, where the equivalent of stack
> unwinding has to be done explicitly by hand, and so people don't get
> used to thinking in terms of automatic cleanup.

Trust me, I'm not thinking in C. I've only programmed in C for a
single semester in college, and that was an awfully long time ago.
My thought process is firmly rooted in C++ and its inherent design
characteristics.

> C++ depends on proper
> stack unwinding on a very fundamental level, and I am absolutely
> convinced that, short of the kind of total disaster to which abort()
> is the only reasonable response, full stack unwinding semantics and
> purely synchronous exceptions must _always_ be enforced rigidly.

Synchronous exceptions don't follow the same reasoning. Granted,
there are difficulties in asynchronous exceptions, but they are not
prone to any of the problems you've discussed in this thread thus far.

> The
> possibility of a thread being terminated without stack unwinding, or
> (if the Java model is followed instead) of an exception popping up
> at an arbitrary point, makes sane programming utterly impossible.

Asynchronous exceptions make coding more difficult, but hardly
impossible. When such exceptions can be thrown only at well defined
points, and they can be disabled during certain operations even if
these points are reached, many of the problems with asynchronous
cancellation disappear.
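
A rough sketch of what I mean by well defined points and disabling,
under stated assumptions: this is not a Boost.Threads interface (there
is no cancellation in the library yet), it assumes a C++11 compiler,
and every name in it (thread_cancelled, cancel_token,
disable_cancellation, cancellation_point, run_cancelled_worker) is
invented for the illustration.

```cpp
#include <atomic>
#include <cassert>
#include <exception>
#include <thread>

// Hypothetical exception thrown when a cancellation request is honored.
struct thread_cancelled : std::exception {};

// Shared flag through which another thread requests cancellation.
class cancel_token {
public:
    void request() { requested_.store(true); }
    bool requested() const { return requested_.load(); }
private:
    std::atomic<bool> requested_{false};
};

// Per-thread nesting count of "cancellation disabled" scopes.
thread_local int cancel_disabled_depth = 0;

// RAII scope during which this thread ignores cancellation points.
struct disable_cancellation {
    disable_cancellation() { ++cancel_disabled_depth; }
    ~disable_cancellation() { --cancel_disabled_depth; }
};

// The only place the exception can originate, so the worker decides
// exactly where it may be interrupted.
void cancellation_point(const cancel_token& token) {
    assert(cancel_disabled_depth >= 0);
    if (cancel_disabled_depth == 0 && token.requested())
        throw thread_cancelled();
}

// Sets a flag from its destructor so that unwinding can be observed.
struct unwind_marker {
    explicit unwind_marker(bool& flag) : flag_(flag) {}
    ~unwind_marker() { flag_ = true; }
    bool& flag_;
};

struct worker_report {
    bool critical_section_finished = false;
    bool destructor_ran = false;
    bool cancelled = false;
};

worker_report run_cancelled_worker() {
    cancel_token token;
    worker_report report;
    token.request();  // requested up front so the outcome is deterministic
    std::thread worker([&token, &report] {
        try {
            unwind_marker marker(report.destructor_ran);
            {
                disable_cancellation guard;
                cancellation_point(token);  // ignored while disabled
                report.critical_section_finished = true;
            }
            cancellation_point(token);      // honored: throws here
        } catch (const thread_cancelled&) {
            report.cancelled = true;        // normal C++ unwinding got us here
        }
    });
    worker.join();
    return report;
}
```

The exception surfaces only where the worker asks for it, the guarded
region runs to completion, and destructors run on the way out.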
 
> (It's the same reasoning that leads to my intense dislike of garbage
> collection, and hence of Java, C#, Python, etc ad nauseam)

Then we may be delving into the territory of religious wars, and
maybe we should both bow out.

Bill Kempf

