Boost-Build :

From: Ali Azarbayejani (ali_at_[hidden])
Date: 2003-05-09 17:22:22


David and I have been discussing a number of issues relating to
refactoring the BBv2 codebase and we thought it would be a good time
to summarize some of them so you can see what we're planning and have
an opportunity for early feedback.

o Dynamic Type Checking

A major problem for comprehension throughout is lack of variable
types and inconsistent naming. For example, a "property set" can
take three forms: a list, a path, or an instance of
object(property-set). Functions that take specific forms usually
use "properties" or "property-set" as the argument name, often
resulting in some confusion over which form the argument should be.
(See, e.g., feature.split and property.as-path.)

Another related problem is that arguments are generally named to
reflect the *type* of the argument, often resulting in confusion as
to the *role* of the argument.

The proposal is to introduce dynamic type checking. David can give
more details, but the resulting syntax will allow the confusing

rule split ( property-set )
rule as-path ( properties * : feature-space ? )

to be written more explicitly as something like

rule <property-list> path-to-list ( <property-path> input-path )
rule <property-path> list-to-path ( <property-list> input-list :
feature-space ? )

or

rule [property-list] path-to-list ( [property-path] input-path )
rule [property-path] list-to-path ( [property-list] input-list :
feature-space ? )

(David likes the square brackets because they don't require a shift.
I don't care about the shift, very slightly prefer the angle
brackets.)

(Note: I changed "feature.split" to "path-to-list" and
"property.as-path" to "list-to-path" to emphasize the complementary
nature of the two functions. For more on feature/property
refactoring, see below.)

This type of readability problem is widespread throughout the
system. The new syntax is backward compatible (arguments and rules
do not require types) and the syntax can be introduced for better
readability before actually implementing the dynamic type checking.
The dynamic type checking would work by registering a type-checking
function for each type. This would be done automatically for
classes. The actual type checking could be turned off to avoid any
possible performance hit.
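
As a rough illustration of the registration idea (written in Python
rather than Jam, and not taken from BBv2), each type name could map to
a checking function, with a single switch to disable checking. All
names here (register_type, check, SETTINGS) are hypothetical, and the
assumed forms are deliberately simplified: a property-list is a list
of "<feature>value" strings and a property-path is the same elements
joined with "/".

```python
# Hypothetical sketch only: none of these names exist in BBv2.
CHECKERS = {}                  # type name -> predicate
SETTINGS = {"enabled": True}   # set to False to skip all checks


def register_type(name, predicate):
    """Register the function that decides whether a value has this type."""
    CHECKERS[name] = predicate


def check(type_name, value):
    """Return value unchanged; raise TypeError if checking is on and it fails."""
    if SETTINGS["enabled"] and not CHECKERS[type_name](value):
        raise TypeError(f"{value!r} is not a {type_name}")
    return value


def is_property(text):
    # Simplified assumption: a property looks like "<feature>value".
    return isinstance(text, str) and text.startswith("<") and ">" in text


register_type(
    "property-list",
    lambda v: isinstance(v, list) and all(is_property(p) for p in v),
)
register_type(
    "property-path",
    lambda v: isinstance(v, str) and all(is_property(p) for p in v.split("/")),
)


def path_to_list(input_path):
    """Typed version of the rule: [property-list] path-to-list ( [property-path] )."""
    check("property-path", input_path)
    return check("property-list", input_path.split("/"))
```

Passing a list where a path is expected raises an error at the call
boundary instead of failing somewhere deeper; with
SETTINGS["enabled"] set to False the checks are skipped entirely.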

o Explicit Importing

Currently "import foo" in a module doesn't guarantee that the module
is dependent upon foo and the absence of "import foo" doesn't
guarantee that the module is NOT dependent on foo. The latter is
because "import foo" in another module makes foo rules available
globally.

This presents a problem for determining which modules depend on
which other modules.

The solution is to modify "modules.import" to make foo rules
available only to the module that is importing them.

This would allow a better visual understanding of each module through
the import statements; would allow automatic determination of module
layering/dependency relationships, leading possibly to automatic
documentation of layering and automatic maintenance or enforcement
of layering and dependencies; and would reduce sloppy and
inconsistent coding practices.
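
Once imports are explicit, the dependency graph can be read straight
from the source text. The Python sketch below is an illustration
under stated assumptions, not an existing tool: it assumes import
statements start a line, take the form "import a b : rules ;", and
that everything before the first ":" or ";" is a module name. The
function names are made up for the example.

```python
import re

# Capture the module names between "import" and the first ":" or ";".
IMPORT_RE = re.compile(r"^\s*import\s+([^:;]+)", re.MULTILINE)


def imported_modules(jam_source):
    """Return the set of module names imported by one module's source text."""
    names = set()
    for match in IMPORT_RE.finditer(jam_source):
        names.update(match.group(1).split())
    return names


def dependency_graph(sources):
    """Map each module name to the set of modules it explicitly imports."""
    return {name: imported_modules(text) for name, text in sources.items()}


if __name__ == "__main__":
    # Made-up module contents, for demonstration only.
    example = {
        "feature": "import utility ;\nimport regex : split ;\n",
        "property": "import feature ;\n# import errors ;\n",
    }
    for module, deps in sorted(dependency_graph(example).items()):
        print(module, "->", ", ".join(sorted(deps)))
```

This only works if "import" is the sole way a module gains access to
another module's rules, which is exactly what the proposed change to
"modules.import" would guarantee.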

o Class Declaration

Each class declaration requires a "rule" declaration followed by a
"class" declaration, making it difficult to see at a glance whether
a rule is a regular rule or a class definition. Short of a core jam
modification to allow a more obvious declaration, it may be possible
to reduce confusion by allowing the placing of the "class"
declaration just before the "rule" declaration. David plans to
explore this possibility. We're not sure there is an easy solution.

Also, see "Refactoring Classes" below. Putting each class in its
own file reduces the confusion as well.

o Option plug-ins

Module "bootstrap.jam" is conceptually low-level, but depends upon
"doc.jam", which results in a great deal of unwanted dependency on
text-processing modules. This is a minor architectural problem.

Actually, bootstrap.jam depends only on the parsing of command line
options representing help requests. One way to decouple bootstrap
from doc and provide a general mechanism for extending command line
options is to provide a plug-in architecture for command-line
options. The "help" module would use the option plug-in technique
to express that it wants to handle certain command line options and
then exit. Other modules might do the same, or handle command line
options and continue. Thus, doc is decoupled from bootstrap and we
have a new command-line-option extension feature.
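
The plug-in idea amounts to a registry that the low-level code
consults without knowing who registered what. A minimal Python sketch
follows; the names (register_option, process_command_line) and the
"exit"/"continue" return convention are assumptions made for the
example, not a proposed BBv2 interface.

```python
# Hypothetical sketch: option name -> handler(value) returning
# "exit" (stop after option handling) or "continue" (go on to build).
HANDLERS = {}


def register_option(name, handler):
    """Called by any module that wants to own a command-line option."""
    HANDLERS[name] = handler


def process_command_line(argv):
    """Dispatch registered options; return True if the build should proceed."""
    proceed = True
    for arg in argv:
        name, _, value = arg.partition("=")
        handler = HANDLERS.get(name)
        if handler is not None and handler(value) == "exit":
            proceed = False
    return proceed


# What a "help" module might do when it is loaded.
def handle_help(value):
    print("usage: ...")
    return "exit"


register_option("--help", handle_help)
```

Here the bootstrap code only calls process_command_line; it never
imports the help module, so the dependency runs from help to the
option registry rather than from bootstrap to doc.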

o Layers (tools, build, core)

A coarse layering of the 55 current .jam files results in the
following three conceptually and structurally useful layers. Note,
however, that refactoring will result in creating, deleting, and
renaming modules. Some current modules contain material that
crosses layers and needs to be separated somehow.

- tools ---------------------------------------------------------

borland, gcc, kylix, msvc, stlport, bison, lex, qt, boostbook,
doxygen, fop, xsltproc, alias, make, stage, symlink, builtin

- build ---------------------------------------------------------

build-request, build-system, version, project, project-root,
targets, prebuilt, modifiers, generators, virtual-target,
scanner, toolset, type, property-set, property, feature

- core ----------------------------------------------------------

bootstrap, doc, class, container, common, set, utility, assert,
print, os, path, modules, errors, sequence, regex, string,
numbers

Other: testing, test, boost-build, user-config, site-config

The "core" layer is a library of bootstrap and core "BBv2 language"
constructs and utilities. The "build" layer contains the
generalized build system logic, with no specific knowledge of how to
build anything in particular. The "tools" layer contains specific
functionality of tools. (Note that "builtin" does not really need
to be built in to the "build" layer...it basically consists of
generalized compiler and linker tool functionality and should
therefore be in the "tools" layer...we plan to refactor this into
modules called perhaps "compiler" and "linker".)

In general,

o BBv2 users interact only with the "tools" layer;
o BBv2 extenders interact with the "build" layer, but typically
only a few modules are meant for normal extensions;
o only BBv2 developers interact with the "core" layer.

There is a strict dependency relationship: each layer depends only
upon lower layers: tools->build->core.
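
A layering rule like this can be checked mechanically once module
dependencies are known. The Python sketch below assumes a mapping from
module to layer and a dependency graph are already available; the
function name and the example data are hypothetical. It reports only
upward dependencies and allows dependencies within a layer.

```python
# Lower rank = lower layer.
LAYER_RANK = {"core": 0, "build": 1, "tools": 2}


def layering_violations(module_layer, dependencies):
    """Return (module, dependency) pairs where a module depends on a higher layer."""
    violations = []
    for module, deps in sorted(dependencies.items()):
        for dep in sorted(deps):
            if LAYER_RANK[module_layer[dep]] > LAYER_RANK[module_layer[module]]:
                violations.append((module, dep))
    return violations


if __name__ == "__main__":
    # Made-up dependencies; "feature" importing "gcc" is the planted violation.
    layers = {"gcc": "tools", "type": "build", "feature": "build", "os": "core"}
    deps = {"gcc": {"type", "os"}, "type": {"feature", "os"}, "feature": {"gcc"}}
    print(layering_violations(layers, deps))
```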

Note, however, that within the "core" layer specifically, there is
no reasonable way to cleanly layer modules within the layer. For
example, modules->assert,errors and assert->modules,
errors->modules...these circular module dependencies are unavoidable
(in any reasonable manner). However, the modules within the "build"
layer can be (with some re-factoring) made to be strictly layered
and this is a goal of the refactoring. Generally, the modules in
"tools" are independent of one another, resulting in a flat module
layering in "tools".

We propose eventual re-structuring of the BBv2 directory to have
three main subdirectories "tools", "build", and "core" containing
the functionality described above, after refactoring. This would
result in import statements like:

import tools/gcc ;
import build/type ;
import core/os ;

We realize the proposed names result in source-tree paths like
"boost/tools/build/tools" and "boost/tools/build/build", which is
kind of ugly. Any suggestions?

o Refactoring feature.jam, property.jam

We spent a lot of time specifically on modules "feature",
"property", and "property-set", as they are the lowest-level modules
in the "build" layer and the right place to start. Some
observations:

- There is no real separation between the concept of "feature" and
the concept of "property". Features exist for the purpose of
properties. Thus the artificial division between modules "feature"
and "property", whatever the original motivation, doesn't really
work, is slightly confusing or misleading, and should be eliminated.

- However, there does seem to be a need for more than one module for
the feature/property content. Module "features" should contain the
core concepts and simple manipulations of single features,
subfeatures, implicit features, single properties, and property
lists. A second module called something with the meaning
"feature-processing" might be introduced to contain some of the
higher-level manipulations like "minimize", conditional properties,
and most of the other contents of the current "property" module.
Any suggestions on naming this second, higher-level module? A good
name might arise after the functionality has been factored.

- Class "property-map" should be in its own file.

- The introduction of class "property-set" at some point in the
development seems to have created some confusion with terminology in
earlier-written code, but it seems as though it serves two necessary
purposes: (1) provide a single object for manipulating what would
otherwise be a list and (2) caching results of certain operations
for performance reasons. Volodya, perhaps you can confirm that
these are the reasons for its existence? This module should
basically stay as it is.

- Throughout these three modules and BBv2 in general there is much
inconsistency in terminology and naming. The Dynamic Type Checking
proposal above will allow us to write clearer code. Some
refactoring will re-group functions in a more meaningful way and
re-name them. For example, to convert a property-list to a
property-path you call "property.as-path", but to do the opposite,
you call "feature.split"; these functions should be in the same
module and named in a way that expresses their complementary nature.
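
To make the two roles attributed to "property-set" above concrete (one
object in place of a raw list, plus cached results), here is a small
Python sketch. It is not the BBv2 class: the class and method names
are invented for illustration, and it assumes a property path is
simply the list elements joined with "/", ignoring implicit features
and values that contain slashes.

```python
class PropertySet:
    """Hypothetical stand-in for object(property-set)."""

    def __init__(self, properties):
        self._properties = tuple(properties)
        self._path = None            # cached result of as_path()
        self.path_computations = 0   # instrumentation for this example only

    @classmethod
    def from_path(cls, path):
        """Complement of as_path: build the object from a property path."""
        return cls(path.split("/"))

    def as_list(self):
        return list(self._properties)

    def as_path(self):
        """Compute the path once, then serve it from the cache."""
        if self._path is None:
            self.path_computations += 1
            self._path = "/".join(self._properties)
        return self._path
```

Keeping both conversions on one class, with names that mirror each
other, is one way to express the complementary nature of the current
"property.as-path" and "feature.split".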

o Refactoring classes

We propose that each class have its own file.

This will have implications for module dependencies, because current
implementations are rather sloppy about what depends on what.
Classes depend on functions in the module they exist in and other
functions in the module depend on the class. These circular
dependencies are probably not necessary. Re-factoring will
highlight more clearly where circular dependencies exist. Trying to
eliminate circular dependencies will probably lead to a clearer
design and better concepts.

(N.B. Jam/BBv2 has no problem handling circular dependencies, even
though humans do, when trying to learn, understand, maintain, or
extend the codebase. We can therefore make a tradeoff between which
circular dependencies to eliminate and which to keep...it's not an
either/or proposition. Personally, I would like to eliminate all
circular modular dependencies above the "core" layer...don't know
yet whether that is reasonable.)
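
Finding the circular dependencies that refactoring should expose is a
standard graph problem. The Python sketch below uses a depth-first
search to return one cycle from a module dependency graph; it is a
generic technique, not something in the BBv2 codebase, and the
function name is hypothetical.

```python
def find_cycle(graph):
    """Return one dependency cycle as a list of module names, or None."""
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the current path, finished
    color = {}
    stack = []

    def visit(node):
        color[node] = GREY
        stack.append(node)
        for dep in sorted(graph.get(node, ())):
            state = color.get(dep, WHITE)
            if state == GREY:
                # dep is already on the current path: that closes a cycle.
                return stack[stack.index(dep):] + [dep]
            if state == WHITE:
                cycle = visit(dep)
                if cycle:
                    return cycle
        stack.pop()
        color[node] = BLACK
        return None

    for node in sorted(graph):
        if color.get(node, WHITE) == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None
```

Run per layer, a check like this could confirm that everything above
"core" is cycle-free while tolerating the known cycles inside "core".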

o "Abstract Targets" vs. "Virtual Targets"

One of the most confusing aspects of the system is the existence of
"abstract-targets", which "generate" "virtual-targets", which
"actualize" "Jam targets". There are class hierarchies for
"abstract-targets" and "virtual-targets".

Our observations and recommendations:

- Eliminate the "abstract-target" base class and resulting
hierarchy. The derived classes ("project-target", "main-target",
"basic-target") are not really related and are never used
polymorphically.

- Rename "project-target" -> something like "project-spec" or simply
"project". Possibly absorb all the functionality of existing
modules "project" and "project-root" into this class...they are
circularly dependent anyway!

- Rename "main-target" -> something like "main-target-spec" ??? It
doesn't actually represent a "target"...it represents a set of
formulas, any one or more of which can be used to create variants of
a target with this id. Anyway, it is a user-level concept and
confusing with respect to internal notions of "target".

- Rename "basic-target" -> something like "target-spec". Again, not
a target but a formula.

- Eliminate or rename "typed-target". Pending David's review of the
"generate" process.

- Rename "virtual-target" -> "target". These are the real objects
that are associated directly with targets.

We are pretty sure about this high-level structural analysis, but
uncertain on the exact names. Also, this whole scheme interacts
intimately with the "generate" process, which is under review
between David and Volodya. More details from David. Pending the
outcome of that, we will have a clearer picture of how these classes
should be structured.

o Generate Process

The "generate" process is a central part of the build system and
currently very convoluted and confusing. David is trying to
understand it and will describe his analysis separately.

o Project-root/Jamfile

The requirement of having a "project-root.jam" file in addition to a
"Jamfile" for every project means that each simple single-Jamfile
project requires an additional file.

In fact, the Jamfile is the unnecessary one of the two, which leads
us to propose that the file performing the function of the current
"project-root.jam" be called "Jamfile", allowing simple projects to
exist with a single Jamfile. The current "Jamfile" really functions
as a sub-Jamfile and should be called something else, but
we're not sure quite how to rename it.

project-root.jam -> Jamfile, Jamroot ??
Jamfile -> SubJamfile ??

Any comments or suggestions?

 

