From: Doug Gregor (dgregor_at_[hidden])
Date: 2006-09-29 10:59:23


Hello Levent,

[Matthias: There's a question for you below, about the skeleton/content mechanism]

On Sep 25, 2006, at 7:17 PM, Levent Yilmaz wrote:
> I wonder why this Port abstraction (or a similar idea) couldn't make
> its way into Boost.MPI? Is there an underlying design decision behind
> this?

I view ports as a useful feature that can easily be added on top of
the current Boost.MPI interface. You can expect ports to be in an
upcoming version of Boost.MPI because, as you've shown, they
certainly simplify many uses of MPI.

> Also, why are the P2P commands members of the communicator class,
> but RC commands such as broadcast and gather global?

The motivation behind this decision was to keep the
"algorithms" (collectives) separate from the "core" operations (P2P
with a communicator). It's a philosophical viewpoint with a practical
point to it: #including all of the algorithms pulls in a lot more
code than including the communicator alone. But perhaps we should
revisit this decision.
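Concretely, the current split looks like this (a minimal sketch; the
ranks and tags are arbitrary, and it assumes at least two processes):

    #include <boost/mpi.hpp>
    #include <string>

    namespace mpi = boost::mpi;

    int main(int argc, char* argv[])
    {
      mpi::environment env(argc, argv);
      mpi::communicator world;

      // Point-to-point operations are members of the communicator.
      if (world.rank() == 0)
        world.send(1, 0, std::string("hello"));
      else if (world.rank() == 1) {
        std::string msg;
        world.recv(0, 0, msg);
      }

      // Collectives are free functions taking the communicator.
      int value = world.rank();
      mpi::broadcast(world, value, 0);
      return 0;
    }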

Perhaps it's unfamiliarity, but I find

        world[0].Bcast(data)

to be a little strange. I know it means that we're broadcasting from
rank 0, but it reads awkwardly for me because it's a receive on any
other rank. Ports feel like a great interface for point-to-point, but
they don't seem to convey the "collective" nature that free-standing
algorithms do.

Another advantage of free functions for the algorithms (which we
don't currently realize) is that we could have versions that don't
take a communicator argument at all. Many simple applications just
use MPI_COMM_WORLD anyway.
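For instance, one could imagine an overload like this (hypothetical;
it relies on the fact that a default-constructed communicator refers
to MPI_COMM_WORLD):

    #include <boost/mpi/collectives.hpp>
    #include <boost/mpi/communicator.hpp>

    // Hypothetical convenience overload: no communicator argument,
    // implicitly operating on MPI_COMM_WORLD.
    template<typename T>
    void broadcast(T& value, int root)
    {
      boost::mpi::communicator world; // wraps MPI_COMM_WORLD
      boost::mpi::broadcast(world, value, root);
    }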

>
> *Message object*
>
> A message in MPI is created by specifying a data type, a data count,
> a data address, and finally a message tag. Note the C interface for
> passing a message in a P2P communication (see the first example
> above): four of the six arguments to MPI_Send() exist just to specify
> the message. And for every communication statement of the same kind,
> this list must be repeated.
>
> Boost.MPI (ignoring the skeleton/content idea for a second) reduces
> this to three arguments (http://tinyurl.com/k2w7q): a reference to the
> data (from which the type is deduced), the data count, and the message
> tag. However, this message is still not a separate entity, and all of
> this information is repeatedly specified as arguments to the
> communication functions.
>
> OOMPI provides the convenience of OOMPI_Message object, which is then
> used in communications. For instance, to pass an array of integers:
>
> //=============================================
> const int size = 5;
> int arr[size];
> OOMPI_Tag msg_tag = 10;
> OOMPI_Message msg(arr, size, msg_tag);
>
> if (world.Rank() == root.Rank()) /* fill up the arr */;
> root.Bcast(msg);
> //=============================================

I guess in the context of Boost.MPI we would parameterize the message
on the data type (to avoid having to dispatch through virtual
functions), e.g.,

        boost::message<std::vector<int> > msg(vec, msg_tag);

I guess I never really grasped the advantage of the message
abstraction. It does tie the data to the tag, but since we rarely
have size > 1, it doesn't seem to be a big win for Boost.MPI.
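If we did add one, a minimal sketch might look like this (purely
hypothetical; neither this message template nor the send_message
helper is part of Boost.MPI):

    #include <boost/mpi/communicator.hpp>

    // Hypothetical statically-typed message: ties the data to a tag
    // without dispatching through virtual functions.
    template<typename T>
    struct message {
      message(T& data, int tag) : data(data), tag(tag) {}
      T& data;
      int tag;
    };

    // Made-up helper showing how such a message might be used.
    template<typename T>
    void send_message(const boost::mpi::communicator& comm, int dest,
                      const message<T>& msg)
    {
      comm.send(dest, msg.tag, msg.data);
    }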

>
> Now, Boost.MPI conveniently reduces the data type definition down to a
> single statement:
>
> mpi::skeleton(mylist);
>
> But then the example has this extra broadcasting statement (indeed,
> the example is such that the skeleton (MPI_Datatype) is created as a
> temporary):
>
> broadcast(world, mpi::skeleton(mylist), 0);
>
> Here, I am not so sure why this is useful. The only thing that is
> common to the master and slave processes' lists is the list_len,
> which is already assumed to be known by all.

I need to clear up this example. We're not assuming that list_len is
known by all. Rather, the master has some kind of data structure that
only it knows about. It broadcasts the skeleton---or shape---of that
data structure so that everyone else can build identically-shaped
containers. The content is passed separately.
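Spelled out, the intended pattern is roughly the following (a sketch
based on the tutorial; sync_list is just an illustrative wrapper):

    #include <boost/mpi.hpp>
    #include <boost/mpi/skeleton_and_content.hpp>
    #include <list>

    namespace mpi = boost::mpi;

    void sync_list(const mpi::communicator& world, std::list<int>& l,
                   int root)
    {
      // Every rank participates. On non-root ranks, this reshapes l
      // to match the shape of the root's list.
      mpi::broadcast(world, mpi::skeleton(l), root);

      // Then transmit the actual element values.
      mpi::content c = mpi::get_content(l);
      mpi::broadcast(world, c, root);
    }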

> The containers live in totally
> distinct memory spaces and the elements are located at different
> addresses. So everybody has to call mpi::skeleton(mylist) separately,
> but not _receive_ it from the master.
>
> The actual data is passed by subsequent, possibly multiple, broadcast
> calls:
>
> mpi::content c = mpi::get_content(l);
> broadcast(world, c, root);
>
> Now, is the content here anything more than MPI_BOTTOM? What extra
> information does it carry?

Content is merely MPI_BOTTOM, with an appropriate data type. The
skeleton/content mechanism is meant to abstract this use of MPI_BOTTOM.
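The payoff is that the (potentially expensive) datatype construction
happens once, while the content can be re-broadcast whenever the
values change. A sketch, where update() and num_steps are
placeholders:

    // Transmit the shape once, then re-send the values each step.
    mpi::broadcast(world, mpi::skeleton(l), root);
    mpi::content c = mpi::get_content(l);
    for (int step = 0; step < num_steps; ++step) {
      if (world.rank() == root)
        update(l);            // placeholder: changes values, not shape
      mpi::broadcast(world, c, root);
    }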

> Is it possible to apply the same skeleton to a different list<int>
> of the same size?

Receiving a skeleton into a list<int> reshapes that list<int> to
mirror the shape of the list<int> that was transmitted.

> Another point, as I indicated before with the C example: is it
> possible to create skeletons of different containers with the same
> value_type (say, list<int> on the master and vector<int>'s on the
> slaves) and pass the contents around without a performance penalty?

I don't believe this is possible, but let's see what Matthias has to
say. He's the expert on skeleton/content.

>
> *MPI-2 Bindings*
>
> Somewhere in the documentation a reference to MPI-2 bindings has
> been made (http://tinyurl.com/jhjo8). Including modern MPI-2 bindings
> is of course an important provision; however, as I indicated
> previously, there are certain supercomputing centers which do not
> support MPI-2 yet. Does this restrict the usage of Boost.MPI in any
> way? In other words, would my code still compile and link if I avoid
> the MPI-2 subset of Boost.MPI?

Yes, it will. Boost.MPI will be compatible with MPI 1.1 for the
foreseeable future.

At present, the only part of Boost.MPI that relies on MPI-2 is the
allocator. It will only be available in Boost.MPI if we have
determined that the underlying MPI provides MPI_Alloc_mem and
MPI_Free_mem.
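When it is available, the allocator plugs into the standard
containers; for example:

    #include <boost/mpi/allocator.hpp>
    #include <vector>

    // Storage comes from MPI_Alloc_mem, which lets the MPI
    // implementation hand back memory optimized for transfers.
    std::vector<int, boost::mpi::allocator<int> > vec(1000);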

If we start introducing abstractions for MPI-2 in Boost.MPI (we'd
like to; time is a big issue), they'll only be available when the
underlying C MPI supports the necessary MPI-2 functionality.

> Also, as indicated by the developers themselves, the current
> implementation supports only a limited subset of MPI 1.1. Yet, IMHO,
> given that this subset doesn't even include virtual topology
> creation, it is perhaps a bit too early to have it in Boost.

Right. We're missing groups, intercommunicators, and topologies. As
with MPI-2, we'd like to provide complete support, but in this case I
(personally) don't have enough experience with these features
(especially topologies) to be certain that we're getting the
abstractions right.

In all fairness, our limited subset is the subset of MPI used by a
large number of people. Many of us don't need intercommunicators or
virtual topologies, so the library itself can be very useful even
before it is complete. That doesn't mean we won't complete it, of
course. We want to.

> Well, thanks for bearing with me this far. I hope I was able to
> provide useful feedback for this fresh library.

Thanks for the feedback!

        Cheers,
        Doug

