Subject: Re: [boost] [modularization] Extract xml_archive from serialization
From: Andrey Semashev (andrey.semashev_at_[hidden])
Date: 2014-09-16 14:09:39


On Tuesday 16 September 2014 09:42:25 Robert Ramey wrote:
>
> I think the notion of "dependency" is richer than can be captured in this
> sort of graph. So it can't
> be understood in terms of this graph alone. I've written about this in the
> past - maybe my post
> was lost due to google forum issues. For anyone who's interested here it is
> again.
>
> Consider another simple case - date time/serialization.hpp
>
> most date/time users don't use this - but a few do. Is serialization a
> prerequisite for date/time? which users are we talking about? One can't
> win here. If you distribute serialization with every use of date/time
> you're distributing too much. If you don't, you'll be failing to ship
> functionality which some users need. What is the solution here - make two
> libraries out of date/time? or what?

The solution will be to separate the dependency on Serialization into an
optional component. This can be a header or a git submodule or a sublib in
DateTime or something else. Which form it takes depends on a number of
aspects, including maintenance convenience, access control, distribution and
deployment infrastructure. I agree that many of these aspects are not defined
at the moment, but from the perspective of maintenance, access permissions and
modularization effort a sublib looks most feasible to me.

> Suppose I have a simple application A which uses the text_archive and only
> serializable types defined within the application itself. It should be
> clear that I can ship that application without shipping any of the libraries
> or code in ../serialization/variant.hpp etc..., xml_archive etc... So one
> can say that A is not dependent upon anything other than the serialization
> library. So, at least for this application, the dependency graph referred
> to above is not a good indicator of what I have to ship with my app. In
> fact, it's misleading.

In an ideal world you could distribute your application with a subset of
Boost selected on a per-header basis. But I think this task is not realistic
at the current stage - mostly because it's difficult to correctly discover all
possible dependencies on a per-header basis.

At this point the most reasonable level of dependency tracking is per-library
or per-sub-library. It is not optimal in that it can add dependencies you
don't actually need, but it's certainly better than the monolithic Boost.
Returning to your example, the application will pull in Serialization and
everything it depends on, unless you extract the optional bits into sublibs or
otherwise make them optional.
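
To make the per-library point concrete, here is a minimal sketch in Python of
what "pull Serialization and everything it depends on" amounts to: a transitive
closure over a library-level graph. The edges below are made up for
illustration and are not taken from any real report, and "serialization_xml"
is a hypothetical sublib name.

```python
from collections import deque

# Hypothetical library-level graph: the monolithic case.
MONOLITHIC = {
    "app": ["serialization"],
    "serialization": ["core", "spirit", "variant"],
    "spirit": ["core"],
    "variant": ["core"],
    "core": [],
}

# Same libraries after moving the optional bits into (hypothetical) sublibs.
SPLIT = {
    "app": ["serialization"],
    "serialization": ["core"],
    "serialization_xml": ["serialization", "spirit"],
    "serialization_variant": ["serialization", "variant"],
    "spirit": ["core"],
    "variant": ["core"],
    "core": [],
}


def required_libraries(root, depends_on):
    """Return every library reachable from root, excluding root itself."""
    seen = set()
    queue = deque(depends_on.get(root, []))
    while queue:
        lib = queue.popleft()
        if lib in seen:
            continue
        seen.add(lib)
        queue.extend(depends_on.get(lib, []))
    seen.discard(root)  # in case of a dependency cycle back to root
    return seen


print(sorted(required_libraries("app", MONOLITHIC)))
print(sorted(required_libraries("app", SPLIT)))
```

With the monolithic graph the application drags in spirit and variant even
though it uses neither; with the split graph it needs only serialization and
core.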

> A little reflection reveals why this is so. The graph is generated by
> considering what it takes to build the serialization DLL and/or LIB which
> includes all the archive classes and perhaps a bunch more stuff.
>
> So the graph tells us something, but what?
>
> The serialization library has several classes of components
>
> a) library core - implements common code to all serialization/archives
> b) particular archive implementations, xml_archive, ...
> dependencies according to the particular archive type being used or
> built
> c) serialization of other library components - e.g. shared_ptr - which
> depends on shared_ptr itself.

These are probably the best candidates for separating from the core.

> d) the test suite - which depends on all the archives being tested - which
> is the boost build default usage
> e) examples - will depend only on a small part of the serialization library.

Tests and examples typically use more components than the library itself (at
least, most tests need some testing library or infrastructure). For this
reason I consider them a special kind of sublib, in the sense that they
are optional, and you would have to explicitly install them for their
dependencies to be pulled in. When you only need the library itself, you don't
have to install the dependencies of its tests and examples.
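
As a rough sketch of this idea (Python, with an invented metadata layout and
invented dependency lists; no existing tool is implied), each library lists
its own requirements plus named optional parts, and the optional parts
contribute dependencies only when explicitly requested:

```python
# Hypothetical package metadata; names and edges are for illustration only.
PACKAGES = {
    "serialization": {
        "requires": ["core"],
        "optional": {"tests": ["test_framework"], "examples": ["filesystem"]},
    },
    "test_framework": {"requires": ["core", "timer"], "optional": {}},
    "filesystem": {"requires": ["core"], "optional": {}},
    "timer": {"requires": ["core"], "optional": {}},
    "core": {"requires": [], "optional": {}},
}


def install_set(name, packages, extras=()):
    """Libraries to install for `name`, plus dependencies of requested extras.

    Extras apply only to `name` itself; the tests and examples of its
    dependencies are never pulled in implicitly.
    """
    pending = [name]
    for extra in extras:
        pending.extend(packages[name]["optional"][extra])

    result = set()
    while pending:
        lib = pending.pop()
        if lib in result:
            continue
        result.add(lib)
        pending.extend(packages[lib]["requires"])
    return result


print(sorted(install_set("serialization", PACKAGES)))
print(sorted(install_set("serialization", PACKAGES, extras=("tests",))))
```

Installing the library alone yields only core and serialization; asking for
its tests adds test_framework and whatever that requires.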

> Now if you wanted to make a series of graphs like:
> a) particular archives text_archive, ...
> b) serialization for each included type e.g variant
> c) all tests, or each subset per archive
> d) examples
> e) other libraries such as date/time which use the serialization library in
> some of its applications and tests but
> not in others.
>
> You'd have something more accurate - but alas - more complex to interpret
> and hence less useful.

The reports Peter publishes show the dependencies of the library headers -
which is our main concern now and is enough to work with at the current stage.
A more accurate report would also include the dependencies needed to build the
library from sources (i.e. the dependencies of src/*).
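
For illustration only, a header-level scan of this kind can be approximated in
a few lines of Python. This is a simplified sketch, not how any actual report
is generated: it guesses the library name from the first path component after
boost/, which is wrong for top-level headers that belong to a differently
named library, so a real tool would need a proper header-to-library map. The
sample header text is invented.

```python
import re

# Matches '#include <boost/NAME...' or '#include "boost/NAME...' and captures
# the first path component after boost/.
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*[<"]boost/([^/>"]+)', re.MULTILINE)


def header_dependencies(source, own_library):
    """Guess which Boost libraries a header depends on from its includes."""
    libraries = set()
    for component in INCLUDE_RE.findall(source):
        # A top-level header such as boost/config.hpp: use the stem as a guess.
        name = component[:-4] if component.endswith(".hpp") else component
        if name != own_library:
            libraries.add(name)
    return libraries


# Invented header contents, loosely modelled on an optional serialization header.
SAMPLE = """
#include <string>
#include <boost/config.hpp>
#include <boost/date_time/gregorian/gregorian_types.hpp>
#include <boost/serialization/split_free.hpp>
#  include "boost/serialization/nvp.hpp"
"""

print(sorted(header_dependencies(SAMPLE, "date_time")))
```

Running the same scan over src/* as well as the public headers would give the
build-time dependencies mentioned above.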

The dependencies of tests and examples are not the issue now, but they will be
when we have a deployment tool. But if we can track dependencies between
libraries, I don't see a problem with doing the same for tests and examples.

> If I had more time, I might be able to make this argument more coherent and
> tighter. Sorry about that.
>
> But the real questions are:
> a) what do we want modularization to accomplish and is this a feasible goal.

Being able to download and install a subset of Boost.

> b) Do we want to obsolete the original concept of equivalency between
> module and developer responsibility?

I don't think we're doing this. At least, not so far.

> c) Do we want to support deployment of boost subset? I think we do.

I think so too.

> d) How should such a subset be defined - via BCP or some boost build
> dependency.

The question of tooling is important, and there's no definitive answer yet,
mostly because there are no prototypes, so there's nothing to choose from. I
remember only one proposal that was discussed on this list, and it wasn't
Boost.Build. Currently, boostdep is used to track dependencies and generate
reports, but there's no modularized deployment tool.

> e) How fine-grained should such a dependency be measured? Does importing one
> header make the whole other library a prerequisite, or just that header
> and associated *.cpp?

At this point, at the library/sublib level. I don't think header level is
feasible yet, but it may be in the future.

All the above is my opinion and understanding, of course.

