Subject: Re: [boost] Is there any interest in a library for actor programming? [preliminary submission]
From: Julian Gonggrijp (j.gonggrijp_at_[hidden])
Date: 2014-05-20 17:15:22


Matthias Vallentin wrote:

>> However, I believe the actor model is not the *best* possible approach
>> to message passing. The model is rather intricate, with monitors,
>> links, handles, timeouts, priorities, groups, and so on. To me it
>> seems a bit like the OO of concurrency: well-designed and insightful,
>> but needlessly complicated compared to a more general and powerful
>> paradigm such as generic programming.
>
> In my eyes, the well-defined failure semantics with links/monitors do
> not convolute the design, but rather make the important aspect of error
> handling explicit.

I will immediately concede that this is important, but I don't think
it is the only possible way.

> Moreover, priorities, links, monitors are all
> *opt-in* and concepts orthogonal to each other. A user can ignore them
> if desired. To stick with your analogy, it sounds to me that this
> modular behavior is what you'd expect from "concurrent generic
> programming."

Yes, I think you are right. So much for my analogy, then. Thanks for
pointing this out to me. :-)

>
>> I think the *right* design would be a concurrent equivalent of
>> generic programming, where the only fundamental building blocks
>> should be a well-designed statically typed SPSC queue, move
>> semantics, a low-level thread launching utility (such as
>> boost::thread) and a concise generic EDSL for the linking of nodes
>> with queues.
>
> The notion of *right* is very subjective, in my eyes.

Of course! No denying that.

> For example, I
> personally don't want threads to be the concurrency building block in my
> application. I would like to run as many threads as I have cores on my
> machine, and a scheduler that maps logical tasks to a thread pool.
> Today, a thread is what C++ programmers choose as a concurrency primitive.
> But it's a hardware abstraction and does not scale. (You cannot spawn
> millions of threads efficiently.) Your application may offer a much
> higher degree of logical parallelism, for whatever notion of task you
> choose.

In the approach I proposed threads would be fundamental building
blocks of the framework, but they do not need to be building blocks
in your application. In fact, there is a fairly straightforward way
to implement a worker pool with a scheduler as an abstraction on top
of the fundamental building blocks. Your application could create the
same network of nodes and queues and feed it into the abstraction of
the scheduled worker pool instead of directly into a thread launcher,
or even take a hybrid approach.
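
To make the worker pool idea concrete, here is a minimal sketch, not a
proposed interface. It assumes a bounded lock-free SPSC ring buffer and
one inbox queue per worker thread, with a single scheduling thread
handing out tasks round-robin. All names (spsc_queue, worker_pool,
submit, sum_of_squares_on_pool) are hypothetical, and the workers
busy-wait with yield() only to keep the example short.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <thread>
#include <utility>
#include <vector>

// Bounded single-producer/single-consumer ring buffer (hypothetical name).
// One slot is kept free, so it holds at most N - 1 elements.
template <typename T, std::size_t N>
class spsc_queue {
    T buf_[N];
    std::atomic<std::size_t> head_{0};  // next slot to read; written by the consumer only
    std::atomic<std::size_t> tail_{0};  // next slot to write; written by the producer only
public:
    // Moves from v only on success, so a failed push can be retried.
    bool try_push(T&& v) {
        const std::size_t t = tail_.load(std::memory_order_relaxed);
        const std::size_t next = (t + 1) % N;
        if (next == head_.load(std::memory_order_acquire)) return false;  // full
        buf_[t] = std::move(v);
        tail_.store(next, std::memory_order_release);
        return true;
    }
    bool try_pop(T& out) {
        const std::size_t h = head_.load(std::memory_order_relaxed);
        if (h == tail_.load(std::memory_order_acquire)) return false;  // empty
        out = std::move(buf_[h]);
        head_.store((h + 1) % N, std::memory_order_release);
        return true;
    }
};

// Worker pool built from nothing but threads and SPSC queues.
// submit() must be called from one thread only: that thread is the
// single producer for every worker inbox.
class worker_pool {
    using task = std::function<void()>;
    struct worker {
        spsc_queue<task, 64> inbox;
        std::thread thread;
    };
    std::vector<std::unique_ptr<worker>> workers_;
    std::size_t next_ = 0;  // round-robin cursor
public:
    explicit worker_pool(std::size_t n) {
        assert(n > 0);
        for (std::size_t i = 0; i < n; ++i) {
            workers_.emplace_back(new worker);
            worker* w = workers_.back().get();
            w->thread = std::thread([w] {
                task t;
                for (;;) {
                    if (!w->inbox.try_pop(t)) { std::this_thread::yield(); continue; }
                    if (!t) return;  // an empty task is the shutdown signal
                    t();
                }
            });
        }
    }
    void submit(task t) {
        worker& w = *workers_[next_];
        next_ = (next_ + 1) % workers_.size();
        while (!w.inbox.try_push(std::move(t))) std::this_thread::yield();
    }
    // Queues are FIFO, so every submitted task runs before the shutdown signal.
    ~worker_pool() {
        for (auto& w : workers_)
            while (!w->inbox.try_push(task{})) std::this_thread::yield();
        for (auto& w : workers_) w->thread.join();
    }
};

// Usage: many small logical tasks mapped onto a few threads.
int sum_of_squares_on_pool(int n, std::size_t workers) {
    std::atomic<int> total{0};
    {
        worker_pool pool(workers);
        for (int i = 1; i <= n; ++i)
            pool.submit([i, &total] { total.fetch_add(i * i, std::memory_order_relaxed); });
    }  // the destructor drains the inboxes and joins the workers
    return total.load();
}
```

The number of logical tasks is independent of the number of threads
here; the application only ever sees submit().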

>
> We have to start appreciating that other languages have had tremendous
> success with the actor model. Scala/Akka, Clojure, Erlang,

I do appreciate that! In fact this is the main reason I believe the
actor model is *good*, and learning about Erlang and the actor model
caused me to look into SPSC queues. I just think it is possible to do
*even better*.

> all show that
> this is an industrial-strength abstraction of not only concurrency
> but also network transparency. (When programming for cloud/cluster
> applications, one has to consider the latter; see below.)
>
>> start(readfile(input) | runlengthenc | huffmanenc | writefile(output));
>
> You describe a classic pipes-and-filters notion of concurrency here,
> where presumably you'd expect your data to flow asynchronously through
> the filters. Effectively, this is just syntactic sugar for message
> passing, where nodes represent actors taking one type of message,
> transforming it, and spitting out another (except for the sink).
> Such an EDSL is orthogonal to the underlying mechanism for message
> passing.

All true, the same syntax could be an interface to an actor-based
framework. The syntactical interface by itself is important, though.
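
To illustrate that the syntax is separable from the transport, here is
a rough C++14 sketch of only the syntactic layer. It deliberately runs
every stage synchronously in the calling thread; a concurrent backend
(threads with queues, or actors) could sit behind the same operator|
without changing the user code. Apart from start and runlengthenc,
which echo the quoted line, all names (node, make_node, read_from,
write_to, uppercase, demo) are made up for illustration, and uppercase
merely stands in for a second filter.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <type_traits>
#include <utility>

// Wrapper type so that operator| only applies to pipeline nodes.
template <typename F>
struct node { F f; };

template <typename F>
node<typename std::decay<F>::type> make_node(F&& f) {
    return {std::forward<F>(f)};
}

// a | b builds a new node that feeds the output of a into b.
template <typename F, typename G>
auto operator|(node<F> a, node<G> b) {
    auto both = [a = std::move(a), b = std::move(b)](auto&&... xs) mutable {
        return b.f(a.f(std::forward<decltype(xs)>(xs)...));
    };
    return node<decltype(both)>{std::move(both)};
}

// Synchronous stand-in for a real launcher: just runs the pipeline once.
template <typename F>
void start(node<F> pipeline) { pipeline.f(); }

// Run-length encoding: "aaabcc" becomes "3a1b2c".
std::string rle(const std::string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size();) {
        std::size_t j = i;
        while (j < in.size() && in[j] == in[i]) ++j;
        out += std::to_string(j - i) + in[i];
        i = j;
    }
    return out;
}

std::string upper(std::string s) {
    for (char& c : s)
        c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    return s;
}

const auto runlengthenc = make_node(rle);
const auto uppercase = make_node(upper);

// Source: takes no input and produces the given string.
auto read_from(std::string s) {
    return make_node([s] { return s; });
}

// Sink: stores the final message in out and produces nothing.
auto write_to(std::string& out) {
    return make_node([&out](const std::string& s) { out = s; });
}

std::string demo(const std::string& input) {
    std::string output;
    start(read_from(input) | runlengthenc | uppercase | write_to(output));
    return output;
}
```

Each stage is statically typed: connecting a node to one that cannot
accept its output fails to compile at the start(...) call.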

> [...]
>
>> I'm a bit skeptical about the necessity and usefulness of built-in
>> network transparency, but you might be able to convince me that it
>> needs to be there.
>
> I feel quite the opposite: network transparency is an essential aspect
> of any message passing abstraction. When developing cluster-scale
> applications, I would like to write my application logic once and
> consider deployment an orthogonal problem. Wiring components without
> needing to touch the implementation is a *huge* advantage. It enables
> implementing complex and dynamic behaviors of distributed systems, for
> example spawning new nodes if the system senses a compute bottleneck.

In other words, it is very powerful to work with nodes/workers/actors
without needing to know whether they are on the same processor or a
remote one. I understand this and I agree that network transparency
has value. What I'm rather skeptical about is that it needs to be
built-in by default; I would prefer it to be opt-in.

Cheers,
Julian

