
From: David Abrahams (dave_at_[hidden])
Date: 2005-08-18 12:04:53


Oliver Kullmann <O.Kullmann_at_[hidden]> writes:

>> >
>> > 9. Why isn't thread cancellation or termination provided?
>> >
>> > There's a valid need for thread termination, so at some point we
>> > probably will include it, but only after we can find a truly safe
>> > (and portable) mechanism for this concept.
>> >
>> > ----
>> >
>> > How do you cancel threads?
>>
>> The FAQ doesn't explain the problem in detail.
>> Here's a brief example:
>>
>> bool f1(X x)
>> {
>>     X * y = new X;
>>     // if f1() is aborted at this point,
>>     // *y won't be deallocated
>>     delete y;
>>     return false;
>> }
>>
>
>
> So we would get a memory leak.

It's not just memory leaks that are an issue. You generally get
undefined behavior if the program continues without properly unwinding
stack frames. See setjmp/longjmp.

> The underlying reason is that threads don't have their own address
> space, and thus the problem seems inherent to the notion of
> threads?!

No, the problem is inherent to skipping destructors.

"Everyone" in the threading community (including Mr. Butenhof, FWIW)
seems to agree that asynchronous thread cancellation is pretty much
untenable most of the time, and that something like exceptions that
are initiated by polling (or calling routines that are designated
"cancellation points") is the only safe way to cancel a thread.

>> > Can one compute the short-circuit parallel or with your "futures" ?!
>>
>> This is a common problem in the world of parallel computations. It's due
>> to the nature of parallel execution and is independent of
>> boost::thread, futures, or any library.
>>
>> Basically, the solution is always the same: f1() polls a flag
>> periodically, and if it is raised it exits.
>>
>> To cancel the thread you just raise the flag and the thread terminates
>> itself:
>>
>> class A
>> {
>> public:
>>     Flag cancel_f1;
>>     bool f1(X x)
>>     {
>>         X * y = new X;
>>         bool actual_result = false;
>>         for (int i = 1; i < 1000; i++)
>>         {
>>             // some computations, e.g. one iteration of the algorithm
>>             if (cancel_f1.is_raised())
>>             {
>>                 delete y; // clean up before the early exit
>>                 return false;
>>             }
>>         }
>>         delete y; // clean up on the normal path too
>>         return actual_result;
>>     }
>> };
>>
>> ...
>> A a;
>> a.run_f1_in_a_new_thread();
>> a.cancel_f1.raise();
>>
>
> This polling-solution doesn't look like a solution for
> what I have in mind. First of all, it's a source of
> inefficiency, but much more important, it does not work
> with generic components: The component doing some computation
> should not need to worry about the circumstances under which
> the computation is performed.

If you believe that, you can't *possibly* believe that asynchronous
cancellation is okay, because it makes the problems much worse.

> But the polling-solution forces
> each component to provide measures for its own abortion,
> and this looks like a design nightmare to me: The library
> I'm developing consists of many components solving some sort
> of "very hard computational problem" (for example AI ...).
> The components on their own as well as any combination should
> possibly run in a parallel computation. I don't believe that
> the polling-solution, effectively polluting every component,
> is suitable for these demands.

It doesn't necessarily have to pollute every component. If, like most
good libraries, your components are mostly exception-neutral, any
system call that is allowed to throw an exception can be modified to
throw cancellation exceptions, and everything will still work.

Still, the issue of exactly how thread cancellation should work in C++
is the source of much contention. You might want to read through
http://www.codesourcery.com/archives/c++-pthreads/thrd7.html#00278 and
other threads on that list.

> Altogether, it seems that threads are not suitable at all for
> what I want to achieve, but *processes* are the right tool here.

Possibly. Process cancellation is much less troublesome than thread
cancellation, because any state usually disappears along with the
process, and any broken invariants along with it. However, you still
have to watch out for shared memory.

> Perhaps I outline shortly for what purpose I need distributed
> computing:
>
> As I said, I'm developing a generic (actually, generative) library
> for solving hard problems (constraint satisfaction etc.). This
> library consists mainly of concepts and models, that is, data
> structures and algorithms. Due to the complicated problem domain,
> most of the time it's completely unknown which of the many competing
> algorithmic resources you should use. So I need to create an
> environment, where the user of the library can start several
> algorithms (whether on a single processor or using multiple
> processors doesn't matter), monitor them if needed (how much
> progress is it making? perhaps we should abort it?), possibly
> getting a result from them, possibly aborting them. (The algorithms
> would all run independently, using their own data; they communicate
> only with the central control, and this in a quite restricted
> way. The main point is, that it's easy to run whatever I want in
> parallel and control it.)
>
> The main motivation for distributed computing (of course, if possible,
> not just on a single computer, but using the Internet ...) here is not
> just that I have multiple processors which I want to exploit, but
> equally (or even more) important is the ease with which alternative
> algorithmic resources can be exploited, possibly gaining a super-linear
> speed-up (quite common in this area --- even on a single processor
> partitioning the (large) search space into different parts and searching
> through them in parallel can be faster than doing it sequentially).
>
> Originally it seemed clear to me, that processes are the right tool here,
> while threads are not suitable. But then I didn't find any support in
> Boost for process control, only the boost thread library, and I wondered
> (hoped) that this could be used. But it seems that was just wishful
> thinking.

You might want to look at the bottom of
http://www.osl.iu.edu/research/pbgl/documentation/graph/index.html

HTH,

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com

Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk