Boost logo

Boost-Build :

From: Vladimir Prus (ghost_at_[hidden])
Date: 2003-05-13 02:50:30


David Abrahams wrote:

> I have been working all evening and am fading fast, but I promised to
> post something about this before morning so Volodya would have a
> chance to look at it if he's back from holiday... so I'll be brief:

Dave, I'm really sorry that I did not reply yesterday. It turned out I had to
waste all the day on some unpleasant business --- checking exam papers, so I
was offline most of the time.

> Currently, there is a "multiple" parameter passed through the
> generation process designed to handle cases like this:

[...]

> The ability to choose "non-greedy" behavior is still needed, though,
> for many reasons. The rule is that non-greedy behavior is chosen
> when the invoking (rightmost) generator has multiple sources but no
> *'s in its signature [in the current sources, the * corresponds to
> calling the generator "composing", but I think signatures are a
> more-general and accurate way of expressing this]. That was done to
> handle cases like:
>
> %.dlp %.o -> %.exe (built from exactly one .o and one .dlp)
> %.cpp -> %.o
> %_parser.whl -> %.cpp
> %.dlp -> %.cpp
> %.wd -> %.whl %.dlp
>
> In this case the desired transformation is:
>
> --->foo.whl-->foo_parser.cpp-->foo_parser.o---
> / \
> foo.wd--< >-->bar.exe
> \ /
> --->foo.dlp-----------------------------------
>
> If the .dlp is converted "greedily" to a .cpp, a suboptimal
> transformation will be found:
>
> --->foo.whl----->foo_parser.cpp-->foo_parser.o-->bar.exe
> / -->foo_lexer.cpp--->foo_lexer.o--->
> foo.wd--< /
> \ /
> --->foo.dlp-
>
>
> My first contention is that the criterion for choosing greediness
> doesn't do what it is supposed to do. For example, we could go back
> to a "composing" signature for .exe above:
>
> %.dlp*,%.o* -> %.exe (built from any number of .o and .dlp files)
>
> Although that clearly wouldn't change the optimal transformation
> sequence, it would turn greediness back on, and we'd get the
> suboptimal transformation.

The question is what's 'optimal transformation'. When generator needs one dlp
and one cpp file, there's no 'optimal one' --- there's single one that could
work. For the second case, there's ambiguity, but is there 'optimal
transformation'. The situation is that exe can consume both .dlp and .o,
where .o can be created from .dlp. That's messy, and at least deserves a
warning. And it's also a corner case, so optimal behaviour here is not so
imporant.

> My second contention is that greedy conversion is not really the
> answer it appears to be. I'm going to base this assertion on the
> idea that there's no good criterion for applying it... so I don't feel
> I need to justify it further ;-)

I think the criteria is simple. Generator passes 'multiple' to recursive
'generators.construct' call if it can do something sensible to multiple
returned targets. Generator with single input can do something. Generator
with two input types cannot.

> Instead, I'm going to propose what I think is the "correct" solution.
> The whole point of introducing this greedy behavior was to eliminate
> ambiguity [as shown above**]. Instead, I propose we figure out how to
> detect when two transformation sequences are equivalent. What we want
> to be able to do, when confronted with two sequences that return
> "extra" targets and appear to be ambiguous, is to detect when
> processing the extra targets of each one would result in the same
> overall set of build actions. Then we can discard either one of the
> two sequences, arbitrarily, without loss of information.

(very very side note: it would be dreadfully cool if we can avoid searching
for one of sequences: i.e. after trying .dlp->.cpp generator to detect that
trying .whl->.cpp would not bring anything new).

Provided we can disambiguate, we don't need greediness. But... do we get
anything by not converting extra targets greedily and passing them up to
top-level generator? I think that if transformation found for that extra
target is the same in both cases, we call well convert greedily, since it's
simpler (IMO).

If tranformations can differ --- it's a big question. [Here I was going to
present example where passing extra target up will fail, but found an example
where both approaches will fail]. Consider

.whl -> .cpp, .h
.cpp -> .o
.o* -> .exe
.h -> moc_%.cpp # This is QT transformation.

See the gotcha? The extra .h target produced from .whl is considered as if it
were a source, and processed with moc, while it should not:

.whl -----> .cpp ----------------> .o ----- .exe
\ /
\--> .h ---> moc_%.cpp -> .o ---/

Let's put this case aside for a while. The real difference between greedy and
pass-up approached is that in first case extra targets can only be converted
to types along the chain where they were generated. In example above, .h can
be converted only to .cpp, or .o, but any other transormations you might have
are ignored. With passing-up, you'll examine transformation from extra target
to all the type top-level generator can consume. This *sounds* better, but
only sounds.

>
> Here's my formula for detecting equivalence.:
>
> 1. Observe that each generator search path unwinding from successfully
> finding a source type or types from the target types is a linear
> chain (which may include an arbitrary of extra targets to be
> processed as sources).

I'm afraid it's not so. This is again QT example:

.ui -------->.cpp ----->.o
\ /
\----->.h (has type UIC_H)

Here, '.h' is not actually 'extra target'. It's explicitly created by
UIC,UIC_H -> CPP generator and is needed to create the .cpp. There are some
subtle points --- i.e. the .h files in reallity is not used by the tool which
generates cpp, but I hope you get the point.

> 2. Observe that each search path (P1,P2) is a sequence of generator
> choices, beginning with sources and ending with targets.
>
> 3. Algorithm:
>
> q1 = P1
> q2 = P2
> while q1 is not empty and q2 is not empty:
> g1 = head(q1)
> g2 = head(q2)
> if g1 != g2:
> if source_signature(g1) intersects source_signature(g2):
> P1 is not equivalent to P2
> else:
> P1 is equivalent to P2
>
> otherwise, P1 and P2 are the same path
>
> Basically, this says that if the two paths ever decide to do
> different things to the same sources, they are not equivalent.
> Otherwise, further exploration of the extra sources produced by
> either one will lead to all the transformations of the other.

So, the idea is that if we have
t1-------- (generator 1)
/
source --(last common generator)-<
\
t2-------- (generator 2)

we declare that there's no abmiguity between selecting generator 1 and
generator 2, because if we select the first, then t2 will be produced anyway,
and the transformation starting with generator 2 will be executed as well.

But how do you prove this? We need to make sure this works at least as good
as current scheme, if we're to move.

- Volodya

 


Boost-Build list run by bdawes at acm.org, david.abrahams at rcn.com, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk