Subject: Re: [boost] [modularization] What is a module? What is a sub-module?
From: Bjørn Roald (bjorn_at_[hidden])
Date: 2014-09-21 11:12:51


On 09/21/2014 10:36 AM, Vicente J. Botet Escriba wrote:
> Hi all,
>
> After the long threads concerning the modularization it seems clear to
> me that we are in an impasse.

Maybe most of the friction is a case of unclear communication rather
than real disagreement. It could be that the goals would be agreed on
if they were clear to everyone. Some participants in the threads seem
to have clear goals in mind for what needs to be done first, and just
feel the need to proceed, while others are confused about what is going
on and why. The latter may need to understand the "why", as in how we
get to an end result we want and what that result looks like. The
former group may be more concerned with what they "know" has to be done
before we get anywhere. They need to convince the skeptics why that is
the case. Neither side's statements and arguments are hard to
understand if you are willing to shift mindset for the sake of
understanding. Nevertheless, there needs to be some level of consensus
before this can proceed.

So how can consensus be achieved? I think giving more concrete meaning
to the terminology used in discussions, proposals and guidelines would
be very helpful. Guessing what people mean by module, sub-module,
library, sub-library, repo, sub-repo, package, dependency, etc. does
not help us understand each other. I have tried to follow the
discussions and have to say misinterpretation seems to be a major
problem. If we could agree on terminology, the quality of the
discussions could improve vastly. A wiki page defining how these terms
are used and not used in Boost could be a normative reference. As I
have been thinking about this a bit, I offer my thoughts here for
comments and elaboration. There are certainly definitions here that I
am not sure are the best or the right ones. However, I have opted not
to present the alternatives I have been considering, with the pros and
cons of each, as I would rather provide a cleaner proposal for
discussion. Here we go:

Library:
A library is a collection of code in Boost that is reviewed and
accepted or rejected by Boost as a community. A library is maintained
by individuals who are the library maintainers. The code is managed in
a separate git repository that is included as a git submodule in the
libs folder of the Boost master repository. A library contains the
library's main module in the subdirectories include, src, test, build,
and doc. In addition, a library may contain a number of additional
directories containing optional modules that depend on the main module;
these are called sub-libraries.

Sub-library:
A library may contain related code in sub-libraries, each of which
should be treated as a separate module to limit the dependencies that
would be incurred if the code were part of the library's main module.
The sub-library has its own module structure containing its own
include, src, test, build, and doc directories. A sub-library is part
of the library and is maintained by the library's maintainers.

Package:
A unit of deployment of Boost source code and/or pre-built libraries,
documentation etc. Typically there may be a one-to-one relationship
between packages and modules, but it is possible to deploy more than one
module in a package or to break one module into more than one package.

Repository:
A version-controlled directory structure containing checked out or
modified files in a working directory and a database of the repository
history and relationships to other repositories. In a git working
directory, the database is in the .git subdirectory or is pointed to by
a .git file.

Sub-Repository:
I suggest we do not use this term to mean sub-library. Use the term
sub-library or git submodule instead.

Module:
An organized set of Boost library code that can be handled in a uniform
manner by Boost tools. A module shall contain the include, test, build,
and doc directories. Modules that are not header-only shall also
contain the src directory that is used to build one or more
corresponding library files.
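
To make the Module definition concrete, here is a minimal sketch of a
layout check. The helper name missing_module_dirs and its header_only
flag are made up for illustration; they are not part of any existing
Boost tool.

```python
import os
import tempfile

# Directories every module shall contain, per the definition above.
REQUIRED_DIRS = ("include", "test", "build", "doc")


def missing_module_dirs(module_root, header_only):
    """Return the required directories that are absent under module_root.

    Hypothetical helper: modules that are not header-only must also
    contain a src directory.
    """
    required = list(REQUIRED_DIRS)
    if not header_only:
        required.append("src")
    return [d for d in required
            if not os.path.isdir(os.path.join(module_root, d))]


if __name__ == "__main__":
    # Build a throwaway module tree that lacks doc and src.
    with tempfile.TemporaryDirectory() as root:
        for d in ("include", "test", "build"):
            os.mkdir(os.path.join(root, d))
        print(missing_module_dirs(root, header_only=True))   # ['doc']
        print(missing_module_dirs(root, header_only=False))  # ['doc', 'src']
```

The same check would apply unchanged to a sub-library, since it has its
own module structure.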

Sub-module:
I suggest we do not use this term to mean sub-library; use sub-library
instead. If it is not clear from context, say git submodule when we
have a git repository tracked using a git submodule in mind
(http://git-scm.com/docs/git-submodule).

Dependencies:
Handling of dependencies is where I struggle the most to see a clear
path forward. In particular, what determines the nodes and edges in
the dependency graphs we care about? And what are we going to use the
dependency graph for? The naive approach is to track module
dependencies alone. That is, each module is a node in a dependency
graph. This does, however, have some major problems.
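
A minimal sketch of the naive approach, with each module as one node.
The module names and edges below are invented for illustration and are
not real Boost dependency data.

```python
def transitive_deps(graph, module):
    """Return every module reachable from `module`, excluding itself."""
    seen = set()
    stack = list(graph.get(module, ()))
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(graph.get(dep, ()))
    seen.discard(module)  # a cycle may lead back to the start
    return seen


# Hypothetical module-level graph: module -> modules it depends on.
graph = {
    "mod_a": {"mod_b"},
    "mod_b": {"mod_c"},
    "mod_c": set(),
    "mod_d": {"mod_a"},
}

print(sorted(transitive_deps(graph, "mod_d")))  # ['mod_a', 'mod_b', 'mod_c']
```

With whole modules as nodes, anything that any part of mod_a needs shows
up as a dependency of everything that uses mod_a.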

Test, Example, and Doc Dependencies:
First of all, if test, example and doc code is part of the module and
incurs additional requirements, we certainly do not always want to
track those dependencies as the module's dependencies. A separate
dependency graph node for test code seems to be a solution, if there is
a real need to track it at all. Documentation can clearly also be
treated separately if need be. However, given this, the module as
defined above is no longer the node in the dependency graph. But that
is probably just the beginning.
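
One way to picture this is to make each node a (module, part) pair
rather than a whole module. This is only a sketch; the part names
"lib", "test" and "doc" and all edges are assumptions made up for the
example.

```python
def modules_needed(deps, start):
    """Return the modules reachable from node `start`, its own included."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(deps.get(node, ()))
    return {module for module, _part in seen}


# Hypothetical graph: (module, part) -> set of (module, part) it needs.
deps = {
    ("mod_a", "lib"): {("mod_b", "lib")},
    ("mod_a", "test"): {("mod_a", "lib"), ("mod_c", "lib")},
    ("mod_a", "doc"): {("mod_d", "lib")},
    ("mod_b", "lib"): set(),
    ("mod_c", "lib"): set(),
    ("mod_d", "lib"): set(),
}

print(sorted(modules_needed(deps, ("mod_a", "lib"))))   # ['mod_a', 'mod_b']
print(sorted(modules_needed(deps, ("mod_a", "test"))))  # ['mod_a', 'mod_b', 'mod_c']
```

A user of mod_a then only pulls in mod_b, while someone running the
tests of mod_a also needs mod_c.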

Lib Dependencies:
Modules that are not header-only have source files in the src directory
that are compiled into one or more library files (ignoring variants
directly supported by Boost.Build). Separate dependency graph nodes may
be appropriate here to distinguish dependencies at link time from those
at compile time. But there are many possible facets to this, so I
think the real use cases for the dependency graph should drive the
requirements for what the nodes and edges shall model. In addition,
dependencies may vary with the configuration of the target environment.
It is not clear if or how such external dependencies should be tracked;
however, starting with the Jamfile lib dependencies is certainly a good
start. It may be that most package management systems have what is
needed for the rest, so it is a matter of bridging these worlds.
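
As a rough sketch of distinguishing link time from compile time, the
edges could carry a kind instead of adding more nodes. The edge list,
the kind labels and the helper direct_deps are assumptions for
illustration; none of this is read from actual Jamfiles.

```python
def direct_deps(edges, module, kind):
    """Return the sorted direct dependencies of `module` of one kind."""
    return sorted(to for frm, to, k in edges if frm == module and k == kind)


# Hypothetical edges: (from_module, to_module, kind).
edges = [
    ("mod_a", "mod_b", "compile"),  # mod_a only uses headers of mod_b
    ("mod_a", "mod_c", "compile"),
    ("mod_a", "mod_c", "link"),     # mod_a also links the mod_c library
]

print(direct_deps(edges, "mod_a", "compile"))  # ['mod_b', 'mod_c']
print(direct_deps(edges, "mod_a", "link"))     # ['mod_c']
```

Whether labelled edges or separate nodes fit better depends on the use
cases the graph is meant to serve.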

Include Dependencies:
Dependencies in the include directory may cause compile and link time
dependencies for the module user. These dependencies are not incurred
until a header is included, directly or indirectly, that requires the
specific dependency to be met. This could, as some have pointed out,
be leveraged to get a very flexible and fine-grained "real" dependency
graph in Boost. However, as the actual dependencies are not known
until the application developer changes source code, compiles and
links, and then understands the cause of the resulting diagnostics,
this is not very helpful for packaging minimal required subsets of
Boost. I am also afraid the diagnostics for missing headers or object
file symbols will not be a very user-friendly solution. However, if
that could be fixed somehow to point directly at the missing package,
or even better, if a package manager could be more or less
automatically invoked to fix it, then this may be a path forward. Such
fine-grained dependency tracking could greatly reduce the need for
sub-libraries. Separating larger chunks of code into a sub-library may
seem reasonable for several reasons, but separating single headers into
their own sub-library only to get a "pretty" graph may clearly be way
off the reasonableness scale, especially if we can argue that we are
not pushing internal Boost structure problems onto the helpless
application developer to figure out. It seems reasonable to look for
something much simpler to facilitate these cases. For lack of a better
term for what some are suggesting, I just invented the term
bridging-header for a mechanism that may help in these situations.
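
A minimal sketch of header-level tracking: collect the boost/ headers a
source file includes directly and guess the module each belongs to.
The path-based guess in guess_module is a crude assumption made for
this example; a real mapping would have to come from the repository
layout. Conditional includes and indirect includes are ignored.

```python
import re

# Matches #include <boost/...> and #include "boost/..." at line start.
INCLUDE_RE = re.compile(
    r'^[ \t]*#[ \t]*include[ \t]*[<"](boost/[^>"]+)[>"]', re.MULTILINE)


def boost_includes(source):
    """Return the boost/... paths included directly by `source`."""
    return INCLUDE_RE.findall(source)


def guess_module(header_path):
    """Guess the owning module from a header path (an assumption).

    boost/foo/bar.hpp -> foo, boost/foo.hpp -> foo.
    """
    name = header_path.split("/")[1]
    return name[:-4] if name.endswith(".hpp") else name


# Hypothetical user source; mod_a and mod_b are invented module names.
source = """\
#include <boost/mod_a/x.hpp>
  # include "boost/mod_b.hpp"
#include <vector>
// #include <boost/mod_c/y.hpp>
"""

print(sorted({guess_module(h) for h in boost_includes(source)}))
# ['mod_a', 'mod_b']
```

A tool along these lines could report the missing package up front
rather than leaving the user to decode a missing-header diagnostic.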

Bridging Headers:
A bridging header is a C++ header file that bridges facilities in one
module with facilities in another module to provide a new convenience
facility to users. The bridging header is part of the include
structure in one of the two modules and depends only on the minimal set
of features from the two modules required to provide the new
convenience facility. A bridging header is marked in a to-be-determined
way that allows dependency tracking tools to track the set of bridging
headers between any two modules as a separate node (a bridge) in the
dependency graph. When a user includes a bridging header, it adds both
of the bridged modules as dependencies; however, it may not be
practical to have every bridge tracked by a package manager as a
separate package.
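
A minimal sketch of how a tracking tool might place bridges in the
graph. Since the marking mechanism is still to be determined, the set
of bridging headers is simply passed in; the function module_graph, the
node naming and all header and module names are hypothetical.

```python
def module_graph(header_deps, owner, bridging):
    """Build module-level edges, routing bridging headers to bridge nodes.

    header_deps: header -> set of modules it includes from
    owner:       header -> module whose include tree holds the header
    bridging:    headers flagged as bridging (how is left open)
    """
    graph = {}
    for header, used in header_deps.items():
        home = owner[header]
        if header in bridging:
            # The bridge is its own node and depends on every module it joins.
            joined = sorted(used | {home})
            node = "bridge(" + ", ".join(joined) + ")"
            deps = set(joined)
        else:
            node = home
            deps = used - {home}
        graph.setdefault(node, set()).update(deps)
    return graph


# Hypothetical headers: b_support.hpp lives in mod_a but also uses mod_b.
header_deps = {
    "mod_a/core.hpp": set(),
    "mod_a/b_support.hpp": {"mod_b"},
    "mod_b/core.hpp": set(),
}
owner = {h: h.split("/")[0] for h in header_deps}

flagged = module_graph(header_deps, owner, {"mod_a/b_support.hpp"})
print(sorted(flagged["mod_a"]))                 # []
print(sorted(flagged["bridge(mod_a, mod_b)"]))  # ['mod_a', 'mod_b']
```

Without the flag, mod_a itself would depend on mod_b; with it, only
users who include the bridging header pick up both modules.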

--
Bjørn
