From: Robert Ramey (ramey_at_[hidden])
Date: 2006-09-17 13:56:46


This isn't a full review - I've just read the documentation and perused the
code. It's more like random observations. A lot of this is very particular
to the way that the MPI library uses/builds upon serialization. So it may
not be of interest to many.

I've spent some more time reviewing the serialization currently checked into
the head. It's quite hard to follow.
Among other things, the MPI library includes the following:

a) an optimization for serialization of std::vector, std::valarray and
native C++ arrays in binary archives.

b) A new type of archive (which should be called mpi_?archive) which
serializes C++ structures in terms of MPI datatypes. This would complement
the archive types that are already included in the package.

i) text - renders C++ structures in terms of a long string of characters -
the simplest portable method.
ii) binary - renders C++ structures as native binary data. The fastest
method - but non portable.
iii) xml - renders ... as xml elements - a special case of i) above.

So we would end up with an mpi_archive and optionally mpi_primitive. In the
other archives, I separated ?_primitive so this could be shared by both text
and xml. In your case it isn't necessary to make an mpi_primitive class -
though it might be helpful and it would certainly be convenient to leverage
on the established pattern to ease understanding for the casual reader.

c) the "skeleton" idea - which I still haven't totally figured out yet. I
believe I would characterize this as an "archive adaptor" which changes the
behavior of any archive class to which it is applied. In this way it is
similar to the "polymorphic_?archive".
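
To make item b) concrete, here is a minimal sketch of an archive that renders a structure as a sequence of datatype runs instead of text or raw bytes. It is an illustration under assumptions, not the MPI library's code: wire_type, field, datatype_oarchive and particle are hypothetical names, and the enum only stands in for real MPI datatype handles so that the sketch compiles without MPI or Boost.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for an MPI datatype handle. A real archive would
// use MPI_Datatype values (MPI_INT, MPI_DOUBLE, MPI_CHAR) instead.
enum class wire_type { int_, double_, char_ };

// One run in the datatype description: element type and element count.
struct field {
    wire_type type;
    std::size_t count;
};

// Hypothetical output archive that records primitives as (datatype, count)
// runs rather than as text or raw bytes.
class datatype_oarchive {
public:
    template<class T>
    datatype_oarchive& operator&(const T& t) {
        save(t);
        return *this;
    }
    const std::vector<field>& layout() const { return layout_; }

private:
    void save(const int&) { add(wire_type::int_, 1); }
    void save(const double&) { add(wire_type::double_, 1); }
    void save(const std::string& s) {
        add(wire_type::int_, 1);          // length prefix
        add(wire_type::char_, s.size());  // characters
    }
    // Anything else is expected to provide serialize(Archive&).
    template<class T>
    void save(const T& t) { const_cast<T&>(t).serialize(*this); }

    // Append a run, merging it with the previous run of the same type.
    void add(wire_type t, std::size_t n) {
        if (n == 0) return;
        if (!layout_.empty() && layout_.back().type == t)
            layout_.back().count += n;
        else
            layout_.push_back(field{t, n});
    }
    std::vector<field> layout_;
};

// Hypothetical user type with the usual member serialize().
struct particle {
    int id;
    double x, y;
    std::string label;
    template<class Archive>
    void serialize(Archive& ar) { ar & id & x & y & label; }
};

// Usage: a particle becomes four runs: int, double x2, int, char x3.
inline void datatype_archive_example() {
    particle p{7, 1.0, 2.0, "abc"};
    datatype_oarchive ar;
    ar & p;
    assert(ar.layout().size() == 4);
    assert(ar.layout()[1].type == wire_type::double_);
    assert(ar.layout()[1].count == 2);
}
```

The user's serialize() is unchanged from what it would be for a text or binary archive; only the rendering of primitives differs, which is what makes this a separate archive type rather than a feature of an existing one.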

In my view these enhancements are each independent of one another. This is
not reflected in the current implementation. I would suggest the following:

a) enhancements to the binary archive be handled as such. We're only talking
about specializations for three templates - std::vector, std::valarray and
native C++ arrays. I know these same three are also handled specially for
mpi_?archives, but it's still a mistake to combine them. In binary_?archive
they are handled one way (load binary) while in mpi_?archive they are handled
another (load_array). I still think this would be best implemented as
"enhanced_binary_?archive".

b) mpi_?archive should derive directly from common_?archive like
basic_binary_?archive does. The reason I have basic_... is that for xml and
text there are separate wide character versions so I wanted to factor out
the commonality. In your case, I don't think that's necessary so I would
expect your hierarchy would look like
    class mpi_archive :
        public common_archive,
        public interface_archive
    ...
I doubt it even has to be a template. It would:
1) render the native archive types (class_id, etc) as small integers - like
the binary archive currently does.
2) render C++ primitives (and std::string) as corresponding MPI datatypes
3) handle the special implementations for C++ native arrays, std::vector and
std::valarray
Note that you've used packed_archive - I would use mpi_archive instead. I
think this is a better description of what it is.
Really it's only a name change - and "packed archive" is already inside an
mpi namespace so it's not a huge issue. BUT I'm wondering if the idea of
rendering C++ data structures as MPI primitives should be more orthogonal to
the MPI protocol itself. That is, might it not be sometimes convenient to save
such serializations to disk? Wouldn't this provide a portable binary format
for free? (Lots of people have asked for this but no one has been
sufficiently interested to actually invest the required effort).
4) Shouldn't there be a logical place for other archive types for message
passing - how about XDR? I would think it would be close cousin to MPI
archives.

c) The skeleton idea would be
    template<class BaseArchive>
    class skeleton_archive
    ....???
(I concede I haven't studied this enough).
This would be coded as an "archive adaptor" (as is polymorphic archive) as
described in a discussion thread some months ago. The concept of the
"skeleton" seems very interesting but really orthogonal to any particular
type of archive. Perhaps the skeleton idea would be useful to other types of
data renderings. By making it an archive adaptor, its facility could be
added to any existing archive. Even if not useful anywhere else, it would
help comprehensibility and testability to factor it out in this way.
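
A minimal sketch of the adaptor shape suggested in c) follows. It assumes, and this is only an assumption, that a "skeleton" means the structure of an object (here, container sizes) without its element values. All names (text_oarchive_sketch, skeleton_oarchive_sketch, save_size, save_vector) are hypothetical and are not taken from the serialization or MPI libraries.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical minimal text archive: every item is written followed by a space.
class text_oarchive_sketch {
public:
    explicit text_oarchive_sketch(std::ostream& os) : os_(os) {}
    void save_size(std::size_t n) { os_ << n << ' '; }
    void save(double v) { os_ << v << ' '; }
private:
    std::ostream& os_;
};

// Hypothetical adaptor: forwards structural information to the wrapped
// archive and drops element values. It never names a concrete archive type.
template<class BaseArchive>
class skeleton_oarchive_sketch {
public:
    explicit skeleton_oarchive_sketch(BaseArchive& base) : base_(base) {}
    void save_size(std::size_t n) { base_.save_size(n); }  // structure: keep
    template<class T>
    void save(const T&) {}                                  // content: drop
private:
    BaseArchive& base_;
};

// Serialization code is written once against the archive interface.
template<class Archive>
void save_vector(Archive& ar, const std::vector<double>& v) {
    ar.save_size(v.size());
    for (std::size_t i = 0; i < v.size(); ++i) ar.save(v[i]);
}

// Usage: the same save_vector call, with and without the adaptor.
inline void skeleton_adaptor_example() {
    std::vector<double> v;
    v.push_back(1.5);
    v.push_back(2.5);

    std::ostringstream full;
    text_oarchive_sketch plain(full);
    save_vector(plain, v);
    assert(full.str() == "2 1.5 2.5 ");

    std::ostringstream shape;
    text_oarchive_sketch inner(shape);
    skeleton_oarchive_sketch<text_oarchive_sketch> skel(inner);
    save_vector(skel, v);
    assert(shape.str() == "2 ");
}
```

Because the adaptor is parameterized on BaseArchive, it can be tested against the simplest available archive and then applied to any other one without change.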

So rather than, or in addition to, an MPI library you would end up with three
logically distinct things. Each one can stand on its own.
The only "repeated" or shared code might be that which determines when
either a binary or mpi optimization can be applied. It's not clear to me
whether the same criterion applies to both kinds of archives or each one has
its own separate criterion. If it's the latter - there's no shared code and
we're done. If it's the former, then a separate free standing concept has to
be invented. In the past I've called this "binary serializable" and more
lately "magic" (a concession to physicists' fondness for whimsical names).

So depending on this last, the serialization part of the MPI library falls
into 3 or 4 independent pieces. If the code were shuffled around to reflect
this, it would be much easier to use, test, verify, enhance and understand.
Also the skeleton concept might be then applicable to other types of
archives. Also the "magic" concept really is a feature of the type and is
really part of the ad hoc C++ type reflection which is what serialization
traits are.
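
The "binary serializable" idea can be sketched as a free standing trait that an archive consults when saving an array. This is a rough illustration under assumptions: is_binary_serializable, byte_oarchive_sketch, point and interval are hypothetical names, and the criterion used here (arithmetic types by default, opt-in for anything else) is only one possible choice.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical trait: true when an array of T may be written as one block.
// Arithmetic types qualify by default; other types opt in by specializing.
template<class T>
struct is_binary_serializable : std::is_arithmetic<T> {};

class byte_oarchive_sketch {
public:
    // Primitive save: copies the object representation of one value.
    template<class T>
    void save(const T& t) {
        static_assert(std::is_arithmetic<T>::value, "primitives only");
        append(&t, sizeof(T));
    }
    // Array save: the trait, not the archive, selects the strategy.
    template<class T>
    void save_array(const T* p, std::size_t n) {
        save_array_impl(p, n,
            std::integral_constant<bool, is_binary_serializable<T>::value>());
    }
    std::size_t size() const { return bytes_.size(); }
    std::size_t block_writes() const { return block_writes_; }

private:
    template<class T>
    void save_array_impl(const T* p, std::size_t n, std::true_type) {
        append(p, n * sizeof(T));  // one contiguous copy
        ++block_writes_;
    }
    template<class T>
    void save_array_impl(const T* p, std::size_t n, std::false_type) {
        for (std::size_t i = 0; i < n; ++i)
            const_cast<T&>(p[i]).serialize(*this);  // element by element
    }
    void append(const void* p, std::size_t n) {
        const unsigned char* b = static_cast<const unsigned char*>(p);
        bytes_.insert(bytes_.end(), b, b + n);
    }
    std::vector<unsigned char> bytes_;
    std::size_t block_writes_ = 0;
};

// Plain data type that opts in to the block path.
struct point { double x, y; };
template<> struct is_binary_serializable<point> : std::true_type {};

// Type with its own serialize(); it does not opt in.
struct interval {
    double lo, hi;
    template<class Archive>
    void serialize(Archive& ar) { ar.save(lo); ar.save(hi); }
};

// Usage: points go out as one block, intervals element by element.
inline void binary_trait_example() {
    byte_oarchive_sketch ar;
    point pts[3] = {{0.0, 0.0}, {1.0, 1.0}, {2.0, 2.0}};
    ar.save_array(pts, 3);
    assert(ar.block_writes() == 1);
    assert(ar.size() == 3 * sizeof(point));

    interval ivs[2] = {{0.0, 1.0}, {2.0, 3.0}};
    ar.save_array(ivs, 2);
    assert(ar.block_writes() == 1);  // no additional block write
    assert(ar.size() == 3 * sizeof(point) + 4 * sizeof(double));
}
```

Since the trait is a property of the type and not of any archive, a binary archive and an MPI archive could both consult it, or each could define its own if their criteria turn out to differ.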

So, that's my assessment.

Robert Ramey

