
Boost-Build :

From: Jurko Gospodnetić (jurko.gospodnetic_at_[hidden])
Date: 2008-05-18 11:22:39


   Hi Volodya et al.

> for example
>
> bjam g=v a=b/c=d a=x/c=y
>
> will build everything in just two variants
>
> g=v a=b c=d
> g=v a=x c=y
>
> whereas without "/", we'd build 4 variants. This behaviour is not very common.

   To clarify:

     bjam g=v a=b c=d a=x c=y

   means the same thing as:

     bjam g=v a=b,x c=d,y

   And builds the following variants:

     g=v a=b c=d
     g=v a=b c=y
     g=v a=x c=d
     g=v a=x c=y
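
   The expansion shown above amounts to a Cartesian product over the
values collected for each feature. A minimal Python sketch, assuming
only explicit 'feature=value' tokens (the function name expand_variants
is hypothetical and not part of Boost Build):

```python
from itertools import product


def expand_variants(args):
    """Expand 'feature=value' tokens into every combination of values.

    Hypothetical helper, not Boost Build code. A feature may get several
    values through repeated tokens (a=b a=x) or a comma list (a=b,x).
    """
    values = {}  # feature -> list of values, in first-seen order
    for arg in args:
        feature, _, value_list = arg.partition("=")
        bucket = values.setdefault(feature, [])
        for value in value_list.split(","):
            if value not in bucket:
                bucket.append(value)
    features = list(values)
    return [
        " ".join(f"{f}={v}" for f, v in zip(features, combo))
        for combo in product(*(values[f] for f in features))
    ]


for variant in expand_variants("g=v a=b c=d a=x c=y".split()):
    print(variant)
```

   Both spellings of the command line go through the same code path
here, so they yield the same four variants.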

> So, the conclusion is to break current "/" syntax and make the new syntax be:
>
> bjam [options] [targets] [properties] ( "--" [targets] [properties] ) *
>
> Here, the "--" token separates independent groups of targets and properties.
> Inside a given group of targets/properties, properties are specified as:
>
> feature=value1
>
> No "/" is allowed. Just as now, it's OK to specify several targets, or several values
> for a given feature, inside one group. The shortcut of:
>
> feature=value1,..,valueN
>
> will still work. The only behaviour that will not be possible in the new scheme is
> specifying features that will apply everywhere. That is, my original example of
>
> bjam g=v a=b/c=d a=x/c=y
>
> would have to be written as:
>
> bjam g=v a=b c=d -- g=v a=x c=y
>
> with "g=v" duplicate. This does not seem like a very bad thing to me, given that
> building several groups of features appears to be rare thing.
>
> Any objections before I go and implement/document this?

   Maybe it's just me 'not being able to grasp the whole picture' but my
gut kept saying that we might be doing something wrong here, precisely
because I could not seem to get the feeling that I have grasped the
whole picture. That is why I will try to coherently spell out my
thoughts on this here...

   First the conclusion (so you do not need to read through my thought
process below :-) ):

   * Implicit feature values of the format 'a=b' should not be allowed.
They could be allowed but that does not seem to be needed and has great
potential for causing much confusion.

   * It should be possible to specify multiple feature values for a
single feature by separating them using a comma character. For features
whose values themselves contain a comma this should be allowed but note
that there is no way to allow this for free features. Also a big warning
should be documented about free feature values not being specifiable
using this syntax and why.

     Example: bjam a=1,2,3 b=4,5 c=d

   * Old / separator syntax used for grouping together multiple property
values should really be scrapped in all cases. This includes specifying
property values for a single target as well as specifying a group of
properties that always go together. This simply is not
easily/consistently usable in the face of feature values containing the
/ character (e.g. path separators), especially free features.

   * Volodya's '--' parameter extension to essentially specify separate
build requests on the same command-line should be implemented but
hopefully using a different separator parameter such as '--new', '--and'
or '--new-build-request'. I also suggest using the '--' parameter with
its usual meaning of stating that parameters coming behind it should be
treated as non-option parameters even though they start with the prefix
'--'.

   If the '--' parameter is used it also introduces a restriction that
no implicit feature or target name should have the value '--' which
should be both checked in code and documented. Note that this also means
that Boost Build can never build a file target named '--'.

     Example: bjam a=11 b=12 c=13 -- a=21 b=22 c=23 -- a=31 b=32

   I have some optional implementation ideas related to this though
which still need to be reviewed/discussed:
     - using a different separator instead of '--'.
     - making the separator configurable
     - allowing any parameter to be treated as a non-option even though
it starts with the prefix '--' by escaping/prefixing it with a backslash
('\') character. Parameters actually starting with '\--' should be
escaped once more, etc...
     - allowing the -- separator to be escaped using \-- so it can be
used as a target name instead of a separator.
     - adding a '--global' separator identifying a group of targets &
properties that are to be added to all previous build requests on the
same command line.

   * We need to provide a way to specify the build request as a response
file passed to bjam. This is needed to cover any more complex build
request cases not otherwise covered by our regular command-line Boost
Build invocation semantics and do so without incurring an additional
cost of having the used Jamfiles parsed multiple times.

   * And the final topic is not really a conclusion but a note not to
forget reimplementing the current && feature value separator as the
current implementation has its own problems including some related to
those detected with the '/' separator. My guess is to not sort entered
properties and use them in the order given but this will most likely be
cleanly/efficiently/easily implementable only after porting Boost Build
to Python.

   And now the thought trail leading up to the conclusion above.

   The first question that popped to my mind was 'Why is any of this
fancy syntax needed?'.

   The very basic and simplest possible syntax we need is one where
each feature can be given at most one value. Different features can
have the same value, but a single feature getting multiple values
defined is treated as an error:

     bjam a=1 b=2 c=3 --> builds 'a=1 b=2 c=3'
     bjam a=1 b=1 c=1 --> builds 'a=1 b=1 c=1'
     bjam a=1 b=1 c=2 --> builds 'a=1 b=1 c=2'

     bjam a=1 a=2 b=3 c=4 --> error as a has 2 values specified.
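
   The strict rule above can be sketched in a few lines of Python. This
is only an illustration under the assumption that every argument is an
explicit 'feature=value' token; parse_single_variant is a hypothetical
name, not an existing Boost Build function:

```python
def parse_single_variant(args):
    """Parse 'feature=value' tokens, allowing each feature only once.

    Hypothetical helper illustrating the simplest call syntax.
    """
    properties = {}
    for arg in args:
        feature, sep, value = arg.partition("=")
        if not sep:
            raise ValueError(f"expected feature=value, got {arg!r}")
        if feature in properties:
            # A second value for the same feature is an error here.
            raise ValueError(f"feature {feature!r} has 2 values specified")
        properties[feature] = value
    return properties


print(parse_single_variant("a=1 b=1 c=2".split()))
```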

   So why is this not enough? With this, if you do need to build
multiple variants, you can write a 'build runner script' executing
multiple bjam commands.

   Thinking of this I found only three reasons why this simple solution
might not be considered satisfactory:

   (1) Developer convenience while developing a project and wanting to
quickly build different build variants without having to prepare a
temporary 'build runner script' for each such build.

   (2) Each bjam based build requires that all the used Jamfiles be
parsed. That means that if we need to build multiple variants those same
Jamfiles will need to be parsed once for each variant.

   (3) If we want to build variants based on all combinations of
multiple feature values for multiple features then the number of
separate bjam runs grows exponentially with the number of features used
(and so does the number of lines we need to write in those build runner
scripts). For example, suppose you want to build the following build
variants: feature a with values 1,2 & 3, feature b with values x,y and
feature c with values 100,200,300,400. Now with the simple syntax
explained above you would need to prepare a build runner script
containing the following bjam calls:

     bjam a=1 b=x c=100
     bjam a=1 b=x c=200
     bjam a=1 b=x c=300
     bjam a=1 b=x c=400
     bjam a=1 b=y c=100
     bjam a=1 b=y c=200
     bjam a=1 b=y c=300
     bjam a=1 b=y c=400
     bjam a=2 b=x c=100
     bjam a=2 b=x c=200
     bjam a=2 b=x c=300
     bjam a=2 b=x c=400
     bjam a=2 b=y c=100
     bjam a=2 b=y c=200
     bjam a=2 b=y c=300
     bjam a=2 b=y c=400
     bjam a=3 b=x c=100
     bjam a=3 b=x c=200
     bjam a=3 b=x c=300
     bjam a=3 b=x c=400
     bjam a=3 b=y c=100
     bjam a=3 b=y c=200
     bjam a=3 b=y c=300
     bjam a=3 b=y c=400
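
   The number of calls is the product of the value counts per feature
(3 * 2 * 4 = 24 here). As a sketch of how such a build runner script
could be generated rather than maintained by hand, assuming Python is
available (runner_script_lines is a hypothetical name):

```python
from itertools import product

# Feature values taken from the example above.
features = {
    "a": ["1", "2", "3"],
    "b": ["x", "y"],
    "c": ["100", "200", "300", "400"],
}


def runner_script_lines(features):
    """Return one 'bjam ...' command line per combination of values."""
    names = list(features)
    return [
        "bjam " + " ".join(f"{n}={v}" for n, v in zip(names, combo))
        for combo in product(*features.values())
    ]


lines = runner_script_lines(features)
print(len(lines))
print("\n".join(lines))
```

   Adding a feature or a value then means changing one entry in the
table instead of editing many command lines.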

   Problem (1) (developer convenience) seems to be the least serious as
I assume most developers generally build only a fixed set of variants
and only rarely need to 'improvise' outside that variant set. Also I
assume most integrate such bjam calls into their IDE (e.g. we have MSVC
projects on Windows run from inside Microsoft Visual Studio) and so
already need some sort of build runner scripts to provide the necessary
interface.

   Problem (2) (multiple Jamfile parsing) seems like it should actually
be solved by upgrading Boost Jam to parse its Jamfiles in much less time
than it takes to do so with the current implementation. One possible way
to attack this problem would be to cache/compile its parsed Jamfile
information and then have it loaded much more quickly on subsequent
runs. However, this does not seem like a 'short term' task and so the
only 'quick enough' solution to this problem seems to be to allow
multiple variants to be built from the same bjam run.

   Problem (3) (exponential number of explicit bjam build runs needed to
build all variants based on multiple features & multiple feature values)
seems like a real problem. Such multiple feature based builds are common
enough, e.g. building all project variants to test your project build or
building multiple Boost library build variants etc. and manually
maintaining such build runner scripts containing a list of all possible
feature combinations seems fragile. Whenever you add a new feature or
change some feature value, you need to update a lot of those bjam calls.

   The way to attempt to compensate for problems found with the above
simple Boost Build call semantics would be to allow a single Boost Build
run to build multiple build variants. Note though that whichever way we
do this there will always be use cases where describing them will be too
complicated or way too verbose to be done on the bjam call command line.
That is why, whichever way this extension is implemented we should not
lose sight of making it possible to specify a completely custom set of
build property-sets and have that built by Boost Build. This means that
the Boost Build call semantics extensions we choose should deal with
problems (1), (2) & (3) in most cases but user/developer should still be
able to somehow 'script out' more complex build requests and not be
penalized for it by having the build system take much longer (e.g. as
described in problem (2) above) or something similar.

   Due to problem (2) above and the fact that it is not possible to
solve that problem quickly enough in a more general way, we should
implement a way to pass a response file to bjam. That way we avoid any
maximal command-line length issues for more complex build requests as
well as allow for such requests to be more easily prepared, possibly by
some sort of a customized script.
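
   The response file format is not specified anywhere in this thread.
Purely as an illustration, assume a text file with one build request
per line, shell-like quoting and '#' comments. Reading it could then
look like this (parse_response_text and the file layout are assumptions,
not an existing bjam feature):

```python
import shlex


def parse_response_text(text):
    """Return one token list per build request line.

    Assumed format: one build request per line, '#' starts a comment,
    quotes protect values containing spaces. Blank lines are skipped.
    """
    requests = []
    for line in text.splitlines():
        tokens = shlex.split(line, comments=True)
        if tokens:
            requests.append(tokens)
    return requests


example = """\
# two build requests, parsed in a single run
a=1 b=2
a=2 b=3 'include=/opt/my libs'
"""
for request in parse_response_text(example):
    print(request)
```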

   And now for the Boost Build call semantics extensions themselves...

   The first possible extension would be to allow multiple values for a
single feature to be specified, as in:

     bjam a=1 a=2 b=3 c=4

       Which would build:
         a=1 b=3 c=4
         a=2 b=3 c=4

     bjam a=1 a=2 a=3 b=4 b=5 c=6

       Which would build:
         a=1 b=4 c=6
         a=1 b=5 c=6
         a=2 b=4 c=6
         a=2 b=5 c=6
         a=3 b=4 c=6
         a=3 b=5 c=6

   This seems like a natural extension and offers no bad side-effects or
additional restrictions.

   Now the next extension is to allow implicit feature values so we can
specify only 'x' and 'y' instead of 'a=x' and 'a=y' in case a is an
implicit feature. This is useful but has a tricky borderline case of
handling implicit feature values of the format 'b=c'. This can be
handled by either not allowing it or by treating it as a recognized
implicit feature value, but whichever way is chosen it should be
documented together with a note that using such implicit feature values
should really not be needed.

   My suggestion would be to disallow it as I have not seen the need for
a single such use case in practice and it has great potential for
causing much confusion... Pretty much seems like an 'evil' thing to do. :-)

   Note though that free features can not be made implicit so if there
actually does exist an implicit feature value of the format 'b=c' then
we know about it from the parsed Jamfiles and therefore no user request
can be ambiguous.

   Next extension could be to allow multiple feature values to be
specified together separated by a comma (','). This seems like a
convenience extension but can cause confusion for feature values
containing a comma character and also ambiguity when used with free
features for which we then do not know whether the comma character
should be treated as a part of the free feature value or a separator
between different feature values.

   Here I would suggest accepting the comma syntax with feature values
containing commas and disallowing its use with free features. Also a big
warning should be documented about free feature values not being
specifiable using this syntax and why.
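
   A minimal sketch of that rule, assuming the parser already knows
which features are free (the FREE_FEATURES set and the split_values
name are illustrative only). It does not attempt the harder case of
non-free feature values that themselves contain a comma, which would
need the list of valid values from the parsed Jamfiles:

```python
# Assumed for illustration; a real parser would ask the feature registry.
FREE_FEATURES = {"define", "include"}


def split_values(arg, free_features=FREE_FEATURES):
    """Split 'feature=v1,v2' into (feature, [v1, v2]).

    For free features the comma is kept as part of the single value,
    because there is no way to tell a separator from value content.
    """
    feature, _, value = arg.partition("=")
    if feature in free_features:
        return feature, [value]
    return feature, value.split(",")


print(split_values("a=1,2,3"))
print(split_values("define=FOO(a,b)"))
```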

   An old extension used was grouping together multiple property values
using the slash ('/') character. This includes specifying property
values for a single target as well as specifying a group of properties
that always go together. This simply is not easily/consistently usable
in the face of feature values containing the / character (e.g. path
separators), especially free features for the reasons Volodya noted in
his previous post.

   The next extension is the one Volodya mentioned with using the '--'
parameter to essentially specify separate build requests on the same
command-line. This is a simple extension and I am all for it as it
allows us to deal with the problem (2) above and introduces only a
slight restriction that no implicit feature or target name may have
value '--'. The good side is that the resulting build command lines are
easy to read. One additional bad side is that we can not specify
'global' targets & properties to be applied to all build requests on the
same command line.
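
   Splitting a command line into independent build requests is simple,
which is part of the appeal. A sketch, with the separator passed as a
parameter so that '--and' or '--new-build-request' could be substituted
(split_build_requests is a hypothetical name):

```python
def split_build_requests(args, separator="--"):
    """Split a flat argument list into groups at each separator token.

    Empty groups (e.g. from two adjacent separators) are dropped.
    """
    groups = [[]]
    for arg in args:
        if arg == separator:
            groups.append([])
        else:
            groups[-1].append(arg)
    return [group for group in groups if group]


command = "a=11 b=12 c=13 -- a=21 b=22 c=23 -- a=31 b=32"
for group in split_build_requests(command.split()):
    print(group)
```

   Each resulting group would then be expanded into variants on its
own, so nothing specified in one group leaks into another.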

   The -- separator usage here is not really standard. Usually this
separator is used to indicate that all later parameters should be
treated as 'non-options' even though they start with the prefix '--'.
That is why perhaps it would be better to use a different separator like
'--new-build-request' or '--and' and use the '--' separator with its
usual meaning.

   I have some optional implementation ideas related to this though
which still need to be reviewed/discussed:
     - using a different separator instead of '--'.
     - making the separator configurable
     - allowing any parameter to be treated as a non-option even though
it starts with the prefix '--' by escaping/prefixing it with a backslash
('\') character. Parameters actually starting with '\--' should be
escaped once more, etc...
     - allowing the -- separator to be escaped using \-- so it can be
used as a target name instead of a separator.
     - adding a '--global' separator identifying a group of targets &
properties that are to be added to all previous build requests on the
same command line.
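
   How the escaping and '--global' ideas might fit together can be
sketched as follows. The exact semantics are still open, so this
assumes that one leading backslash is stripped from any argument of the
form '\--...', and that a '--global' group runs until the next
separator and is appended to every non-empty request before it. The
names unescape and parse_requests are hypothetical:

```python
def unescape(arg):
    # Strip one backslash from arguments such as \-- or \\--foo, so an
    # escaped token is taken literally instead of as a separator/option.
    if arg.startswith("\\") and arg.lstrip("\\").startswith("--"):
        return arg[1:]
    return arg


def parse_requests(args, separator="--", global_marker="--global"):
    """Split args into build requests, honouring escapes and --global."""
    requests = [[]]
    in_global = False
    for arg in args:
        if arg == separator:
            requests.append([])
            in_global = False
        elif arg == global_marker:
            in_global = True
        elif in_global:
            # Assumed semantics: add to all previous non-empty requests.
            for request in requests:
                if request:
                    request.append(unescape(arg))
        else:
            requests[-1].append(unescape(arg))
    return [request for request in requests if request]


print(parse_requests(["a=1", "--", "a=2", "--global", "g=v"]))
print(parse_requests(["a=1", r"\--", "--", "a=2"]))
```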

   The one final extension I see is the current '&&' separator which has
its own set of problems but this is a separate story.

   If you got this far - thanks for reading... :-)

   Hope this helps.

   Best regards,
     Jurko Gospodnetić


Boost-Build list run by bdawes at acm.org, david.abrahams at rcn.com, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk