Boost logo

Boost :

Subject: Re: [boost] [SoC] BGL project idea
From: Jeremiah Willcock (jewillco_at_[hidden])
Date: 2010-07-15 21:22:34


On Thu, 15 Jul 2010, Eric Wolf wrote:

> Dear Jeremiah,
>
> thanks for the fast response.
>
> 1) I'll give some general clarifications on the algorithm, then
> 2) I want to talk (read/write) about how to implement it and your
> very valueable suggestions.
>
> 1) First of all some clarification (i'm not sure you need
> them though :) )
>
> Jeremiah Willcock <jewillco_at_[hidden]> writes:
>> On Wed, 14 Jul 2010, Eric Wolf wrote:
>>> Jeremiah Willcock <jewillco_at_[hidden]> writes:
>>>>>> Why are you computing the transitive closure explicitly when it is not
>>>>>> part of the algorithm's documented output? It there a reason that you
>>>>>> can't use the SCC structure (and the connections between them)
>>>>>> directly to compute the reduction without storing the full closure?
>>>> Is there a reason you're not using the current transitive_closure
>>>> algorithm in BGL?
>
> An already computed transitive closure can't be used, as there is a need
> to compute it in all cases, as some times it must be checked, whether an
> edge is already in the to be computed closure.

I don't understand what you're saying, and it seems to be contradicted by
your next part about using Vladimir's transitive_closure code.

>>> I will modify transitive_closure from Vladimir Prus and use successor
>>> sets. This will give better time complexity and better memory usage.
>
> So the transitive closure is stored in the successor sets in what I
> imagine to be the next version of transitive_reduction.hpp. No explicit
> transitive closure needed any more.

That's just a transitive closure stored as an adjacency list, though,
right? Why not just use that rather than an adjacency_matrix, or make it
configurable by the user (since looking up whether edges are in the
closure is faster with a matrix when the closure is dense).

> I cite some text, which would came later in your email:
>
>> If I understand correctly, though, you always need
>> to make some kind of transitive closure graph internally, even if the
>> user doesn't want it;
>
> Correct. Thats what I will use successor sets for in my next imagined
> version.

OK.

>> I don't remember if that was just for the condensation graph or if it
>> was a transitive closure of the whole input graph.
>
> Just for the condensation graph.

That's what I thought. BTW, did you see
<boost/graph/create_condensation_graph.hpp>? Is anything in there useful
to you?

>> You also need an explicit condensation graph as temporary storage for
>> doing the reduction, too, or am I mistaken?
>
> I'm not sure about this. The transitive_reduction of the condensation
> graph is only touched to insert edges. It is not read in the “core“
> algorithm. It is needed, to construct the transitive_reduction of the
> original graph I think. (If the reduction of the original graph is
> stored as an explicit graph, or if I write vertex pairs to an output
> iterator isn't important though.)

So you don't need to read back any of the transitive reduction of the
condensation graph as part of creating it?

> 2) Now I answer the other points in your response:
>
>> OK. Please try to contribute that as a separate algorithm so other
>> people can use it as well.
>
> Ehh ... I don't understand, what you mean. A separate algorithm to the
> previous version? Who would want the previous version (of
> transitive_reduction)?

I think that was referring to the successor-set-based transitive closure
algorithm, but check my previous email to be sure.

>>>>>> Why are there functions that do the transitive closure as well as the
>>>>>> reduction? Is having both of them a common use case? It appears that
>>>>>> there is a lot of redundant code between the reduction-with-closure
>>>>>> and reduction-only functions.
>>>>>
>>>>> It's the same algorithm. The algorithm for computing the closure and the
>>>>> reduction is essentially the same. So you could have the other one for
>>>>> free if you generate one of them. I doubt that it is a common use case,
>>>>> I admit, but I thought one should have the possibility to get both in
>>>>> one run, if its for free. So the reduction-with-closure and
>>>>> reduction-only functions are essentially duplicates of each other.
>>>>
>>>> OK. Is there a way to factor out the common code?
>>>
>>> ** long lines below **
>>>
>>> Several ways to accomplish that come to my mind, but I'm unsure which
>>> route to go and ask you for some advice.
>>
>> The closure and reduction will always have the same number of vertices
>> as the original graph, right?
>
> yes
>
>> They just add or remove some edges, if I understand correctly.
>
> yes
>
>> What if you didn't output a graph at all, just a list of edges (pairs
>> of vertices, since the edge may not exist in the original graph)?
>> Then a user could easily ignore the list, or feed it back into a graph
>> class's constructor to create a result graph. You could then use
>> named parameters and make the default be to ignore the outputs.
>
> That's a great idea. But I couldn't get the grasp of bgl named
> parameters until now. I thought bgl would use the boost::parameter
> library, but in my perception the documentation and what's done in
> named_parameters.hpp don't fit.

I think BGL predates Boost.Parameter. In new algorithms, what we do is
convert the BGL named parameter structure to a Boost.Parameter one and
then use Boost.Parameter (with a bunch of BGL-specific wrapper functions)
to work with it. Look at <boost/graph/random_spanning_tree.hpp> and the
trunk/1.44 version of <boost/graph/astar_search.hpp> for examples.

>> If I understand correctly, though, you always need to make some kind
>> of transitive closure graph internally, even if the user doesn't want
>> it; I don't remember if that was just for the condensation graph or if
>> it was a transitive closure of the whole input graph. You also need
>> an explicit condensation graph as temporary storage for doing the
>> reduction, too, or am I mistaken?
>
> The reduction could be easily written to an output iterator I imagine,
> but the computation of the transitive closure of the condensation graph
> must be done in all cases and stored somehow. Now in the successor sets.

OK.

>> So here are some other options I've thought of; if some of the stuff I
>> wrote in this paragraph is wrong, that might change which ones are
>> possible.
>>
>> 1 (easy). Accept an Output Iterator that will get the reduction as a
>> list of edges, built using the vertices from the original graph. Do
>> the same for the closure. Use named parameters to make both of these
>> outputs ignored by default, but they can also be used to create new
>> graphs.
>
> Even it the reduction is not used or wanted by the user, it still would
> be computed from the reduction of the condensation graph. But maybe it
> is an accepable annoyance?

I thought you could save some time by computing only the closure; isn't a
reduction on even the condensation graph extra work to compute? Remember,
too, you can probably assume that code like:

if (!boost::is_same<ReductionOutput, dummy_iterator>::value) {
   compute reduction
}

would be handled well by compilers.

>> 2 (harder, but not because of template tricks). Create two
>> semi-implicit graphs, one for the closure and one for the reduction;
>> return both from your algorithm. In this case, your graphs would only
>> store a map from each vertex to its component (i.e., vertex in the
>> condensation graph), which vertex is the representative of its
>> component (for reductions), and the condensation graph (either its
>> closure or reduction). You would then need to do the reduction even
>> if the user doesn't need it, but you would not need to store the
>> reduction or closure of anything but the condensation graph. You
>> could also just have two variants of the algorithm, one that creates
>> the condensation graph and its transitive closure, and another that
>> creates both the condensation graph, its closure, and its reduction.
>> That will likely save you code still, and named parameters can be used
>> to determine in one function whether the reduction is necessary to
>> compute (and this can probably be done with a run-time if statement on
>> a property of a template argument, without needing any fancy
>> metaprogramming beyond type_traits).
>
> And a two functions to construct the reduction and the closure from
> the cond.gr.rd. and the cond.gr.cl. and the mappings. Yeah, thats a
> possible way.

I don't mean two functions. Here's what I had in mind in more detail:

Assume:
- component == SCC
- G is the graph
- .+ is the transitive closure operator
- .- is the transitive closure/reduction operator
- C is the condensation graph
- comp(v) maps from a vertex in G to its corresponding component in C
- next(v) maps from a vertex in G to another vertex in the same component
(creating a loop in each component)
- rep(v) is a bit flag that is true for one vertex in G in each component

Now, as far as I understand, an edge (v, w) is in G+ iff (comp(v),
comp(w)) is an edge in C+. An edge (v, w) is in G- iff either w = next(v)
or rep(v) and rep(w) and (comp(v), comp(w)) is an edge in C-. You can
define a graph type based on those computations. For the closure, you
would store C+ and comp() explicitly, and then compute things like
out_edges(v, G+) based on those using the formula above. For the
reduction, you would store C-, comp(), next(), and rep(), then use the
formulas to compute out_edges(v, G-). The other graph operations like
vertices() would just forward to the original G. Using that kind of
approach, you can create graph objects (things that model the graph
concepts) that can be used to get the transitive closure and reduction of
G while only storing information about the closure and reduction of C and
a couple of other property maps on G.

>> Your options:
>>
>>> 1) Define a Graph type which absorbs all add_vertex and add_edge
>>> operations like /dev/null in unix, then write one implementation
>>> function like ...
>>
>> I don't like this option because I don't think the reduction can be
>> removed completely, since I believe you need an explicit graph for the
>> reduction when you are computing (you use already filled-in parts of
>> it to create other parts, right?).
>
> Nope. The reduction of the condensation graph is only written, not read
> in the core algorithm. It gets read if one tries to build the transitive
> reduction of the whole graph.
>
> But I don't like that option, too. It is ugly.

OK. When do you need to read it back to produce the closure of the full
graph? Don't you just need equivalents of the comp(), next(), and rep()
maps that I talked about above, plus a way to emit the edges of the
reduction of the condensation graph?

>>> 2) Use boost::enable_if like that: (In the following ... denotes an
>>> ellipsis of zero or more parameters, not varargs.) ...
>>
>> This option is feasible and probably easier than you think it is;
>> named parameters will take care of the dispatching, so you probably
>> won't need enable_if.
>
> If you think it is feasible, then I would like to take that route. One
> remedy still remains in my perception. If the user doesn't want the
> transitive reduction, an unused local variable remains in the main
> implementation function ...
>
> Can that be removed somehow by fancy template stuff?

Something simple like the if statement I showed above should be enough to
deal with that issue. If not, it's pretty easy to use mpl::if_ to remove
the reduction code more completely.

> What's your favorite choice?

I like the implicit graph idea because of its elegance and reduction in
output size, but it is more work to implement (the formulas I put down
above are fairly simple but turning them into iterators is more work; if
you want to go this way, I can give more detailed pseudocode closer to
what you actually need to write to model BGL concepts). Otherwise, either
streaming edges to an output iterator (my option 1) or one of your options
is reasonable. Your option 1 is probably equivalent to your option 2,
since you can special-case the code based on your black-hole graph type
and so not actually do the graph insertions when it is used; remember too
that named parameters can be used so the user doesn't actually see the
black-hole graph.

-- Jeremiah Willcock


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk