Subject: Re: [boost] [modularization] Extract xml_archive from serialization
From: Robert Ramey (ramey_at_[hidden])
Date: 2014-09-16 12:42:25


Stephen Kelly-2 wrote
> Hi there,
>
> The biggest single problem of coupling in boost comes from the spirit
> dependency in the serialization library. This makes serialization itself
> very (and needlessly) heavy:
>
> http://www.steveire.com/boost/2014sept16_serialization.png
>
> Spirit is used only by the xml archiving classes.
>
> I recommend extracting an xml_archive library from serialization. That way,
> serialization no longer depends on spirit, which is already an improvement:
>
>
> http://www.steveire.com/boost/2014sept16_serialization-after-extract-xml_archive.png
>
> Further, the serialization classes for boost::variant and boost::array are
> in the serialization library. This is not appropriate, as the serialization
> classes for all other types are in the libraries providing the types. Move
> the serialization classes to those libraries.
>
> mkdir ../variant/include/boost/serialization
> mv include/boost/serialization/variant.hpp ../variant/include/boost/serialization
>
> mkdir ../array/include/boost/serialization
> mv include/boost/serialization/array.hpp ../array/include/boost/serialization
>
> This is a large improvement:
>
>
> http://www.steveire.com/boost/2014sept16_serialization-after-type-move.png
>
> There is more that can be done. These things can be done now. I recommend
> doing them.
>
> Thanks,
>
> Steve.

I think the notion of "dependency" is richer than this sort of graph can
capture, so it can't be understood in terms of the graph alone. I've written
about this in the past - maybe my post was lost due to google forum issues.
For anyone who's interested, here it is again.

Consider another simple case - date time/serialization.hpp

Most date/time users don't use this - but a few do. Is serialization a
prerequisite for date/time? Which users are we talking about? One can't
win here. If you distribute serialization with every use of date/time
you're distributing too much. If you don't, you'll be failing to ship
functionality which some users need. What is the solution here - make two
libraries out of date/time? or what?

Suppose I have a simple application A which uses the text_archive and only
serializable types defined within the application itself. It should be
clear that I can ship that application without shipping any of the libraries
or code in ../serialization/variant.hpp, xml_archive, etc. So one
can say that A is not dependent upon anything other than the serialization
library. So, at least for this application, the dependency graph referred
to above is not a good indicator of what I have to ship with my app. In
fact, it's misleading.

A little reflection reveals why this is so. The graph is generated by
considering what it takes to build the serialization DLL and/or LIB which
includes all the archive classes and perhaps a bunch more stuff.
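
The difference between "what the whole library build needs" and "what
application A needs" can be sketched with a toy graph. This is only an
illustration: the component names and edges below are assumptions made up for
the example, not measurements of the real Boost dependency graph, and
closure() and toy_graph() are hypothetical helpers.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Direct dependencies of each component (toy data, not measured from Boost).
using Graph = std::map<std::string, std::vector<std::string>>;

// Return every component reachable from `root`, excluding `root` itself.
std::set<std::string> closure(const Graph& g, const std::string& root) {
    assert(!root.empty());  // a component name must be given
    std::set<std::string> seen;
    std::vector<std::string> stack{root};
    while (!stack.empty()) {
        const std::string node = stack.back();
        stack.pop_back();
        auto it = g.find(node);
        if (it == g.end()) continue;  // leaf: no recorded dependencies
        for (const auto& dep : it->second) {
            if (seen.insert(dep).second) stack.push_back(dep);
        }
    }
    seen.erase(root);  // only matters if the graph has a cycle through root
    return seen;
}

// Hypothetical component graph: the "build" node stands for building the
// whole serialization DLL/LIB, "app_A" for an app using only text_archive.
Graph toy_graph() {
    return {
        {"serialization_build",
         {"core", "text_archive", "xml_archive", "variant_support"}},
        {"text_archive", {"core"}},
        {"xml_archive", {"core", "spirit"}},
        {"variant_support", {"core", "variant"}},
        {"app_A", {"text_archive"}},
    };
}
```

With this toy data, closure(toy_graph(), "app_A") contains only text_archive
and core, while closure(toy_graph(), "serialization_build") also pulls in
xml_archive, spirit, variant_support and variant. Which root you start from
decides what the graph appears to say.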

So the graph tells us something, but what?

The serialization library has several classes of components:

a) library core - implements code common to all serialization/archives
b) particular archive implementations, xml_archive, ... - with dependencies
according to the particular archive type being used or built
c) serialization of other library components - e.g. shared_ptr - which
depends on shared_ptr itself.
d) the test suite - which depends on all the archives being tested - which
is the boost build default usage
e) examples - will depend only on a small part of the serialization library.

Now if you wanted to make a series of graphs like:
a) particular archives - text_archive, ...
b) serialization for each included type, e.g. variant
c) all tests, or each subset per archive
d) examples
e) other libraries such as date/time which use the serialization library in
some of their applications and tests but not in others.

You'd have something more accurate - but alas - more complex to interpret
and hence less useful.

So - the degree of "modularization" cannot be determined or illustrated or
measured by examining the graph above.

The question has to be couched in more concrete terms:

a) If I want to distribute only the components required to build some
particular app - I don't know if we have a tool to do this, but both boost
build and CMake will build only the components required to build the app.

b) If I want to distribute the components required to build any application
which might use any of the interface in the serialization library - I'll be
distributing a lot - as you point out above.

etc.

So, taken to its logical conclusion, extracting xml_archive would lead to
extracting other components as well. One or more for each of the classes a-e
listed above. I'm not sure we want to do this.

Traditionally, (and not just in boost) libraries have been organized around
developer responsibility. This has more or less paralleled "dependency".
"dependency" has also been fostered by incremental addition to the library
set.

If I had more time, I might be able to make this argument more coherent and
tighter. Sorry about that.

But the real questions are:
a) What do we want modularization to accomplish, and is this a feasible goal?
b) Do we want to obsolete the original concept of equivalency between module
and developer responsibility?
c) Do we want to support deployment of a boost subset? I think we do.
d) How should such a subset be defined - via BCP or some boost build
dependency?
e) How fine-grained should such a dependency be measured? Does importing one
header make the whole other library a prerequisite, or just that header
and the associated *.cpp?
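
Question e) can be made concrete with a small sketch that counts prerequisite
libraries under both policies. Everything here is an assumption for
illustration: the header names, the include edges and the header-to-library
mapping are invented, and required_libraries() is a hypothetical helper, not
part of any Boost tool.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

using Includes = std::map<std::string, std::vector<std::string>>;  // header -> headers it includes
using Owners = std::map<std::string, std::string>;                 // header -> owning library

// Libraries required by `start`. With whole_library == false only headers
// actually reached through includes count; with true, touching one header
// of a library makes every header of that library a prerequisite.
std::set<std::string> required_libraries(const Includes& inc, const Owners& own,
                                         const std::string& start,
                                         bool whole_library) {
    assert(!start.empty());
    std::set<std::string> headers{start};
    std::set<std::string> libs;
    std::vector<std::string> todo{start};
    while (!todo.empty()) {
        const std::string h = todo.back();
        todo.pop_back();
        std::vector<std::string> next;
        auto i = inc.find(h);
        if (i != inc.end()) next = i->second;
        auto o = own.find(h);
        if (o != own.end()) {
            const bool first_visit = libs.insert(o->second).second;
            if (whole_library && first_visit) {
                // Coarse policy: pull in all headers owned by this library.
                for (const auto& entry : own)
                    if (entry.second == o->second) next.push_back(entry.first);
            }
        }
        for (const auto& n : next)
            if (headers.insert(n).second) todo.push_back(n);
    }
    return libs;
}

// Invented include graph: an app that uses only the plain date header.
Includes toy_includes() {
    return {
        {"app.cpp", {"date_time/date.hpp"}},
        {"date_time/date.hpp", {}},
        {"date_time/serialize.hpp", {"serialization/core.hpp"}},
        {"serialization/core.hpp", {}},
        {"serialization/xml_archive.hpp",
         {"serialization/core.hpp", "spirit/parser.hpp"}},
        {"spirit/parser.hpp", {}},
    };
}

// Invented mapping from each header to the library that ships it.
Owners toy_owners() {
    return {
        {"date_time/date.hpp", "date_time"},
        {"date_time/serialize.hpp", "date_time"},
        {"serialization/core.hpp", "serialization"},
        {"serialization/xml_archive.hpp", "serialization"},
        {"spirit/parser.hpp", "spirit"},
    };
}
```

On this toy data the header-level policy (whole_library == false) reports
only date_time for app.cpp, while the library-level policy reports date_time,
serialization and spirit, even though the app never serializes anything.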

My basic point is that these questions have to be addressed before the
notion of decoupling can be carried much further.

In concrete terms - should the exclusion of xml_archive be:
a) dropped altogether - (fine by me btw)
b) created as a separate library module
c) not included in builds that don't require it? Note that boost build
already does this.
d) or should the whole serialization library be subdivided into a "library
group" based on consideration of the classes above? (lol - never going to
happen)

I'm concerned that the movement to diminish module dependencies is failing
to take into account the above considerations. At least I don't recall
seeing these considerations explicitly addressed.

Robert Ramey

