Boost logo

Boost Users :

Subject: Re: [Boost-users] distributed breadth first search doesn't work properly?
From: Cedric Laczny (cedric.laczny_at_[hidden])
Date: 2010-12-11 06:38:00


In addition to my previous answer, your observed behavior more and more seems
reasonable to me. It may seem uncorrect, but that's a question of the point-
of-view IMHO. Solutions to some problems differ when going from sequential to
distributed computing and the concept chosen by the PBGL implies that the
"best" value for the distance should be stored here.
But now to explain why I further support the observed behavior. The algorithms
in (P)BGL are based on visitors that define event points. One of those is the
initialization of a vertex. Due to the distributed behavior, you have two (or
more) processes that can start independently. Thus, these two processes will
usually move from intializing to discovering (another event point) at different
moments. If you look at the code of the distributed_bfs_visitor you will see
the tree_edge() function definition which relies on the stored value of the
source in order to assign the appropriate distance value to the target of an
edge. Due to the distributed computing, the vertex n4 can/will start
discovering while the vertex n0 is _still_ initializing (the formulation
probably is not 100% correct as I just want to illustrate the general idea).
Because vertex n4 has not yet been discovered by n0 (only initialized or, even
more, maybe default initialized) it will discover vertex n3 and assign it the
discovery-value of n4 (which is 0) plus 1. When n0 now discovers vertex n4, it
would like to assign it the value 1, but this is prohibited by the
set_property_map_role(). The same will hold for n3.
What further supports this, is the label (distance-value) of vertex n1. In
your single-process call it gets a value of 2, while in the multi-process call
this is changed to 1. Because this is a BFS, n1 is discovered in the same
level as n3 via n4. Thus, it gets the same value as n3 and will not be
assigned a value by the discovery via n0 (where it seems to be one level
deeper in the underlying graph).

The only thing I have not come up with a me-satisfying explanation is why the
second BFS starts at n4 (I mean, it is the "last" vertex). I suspect that this
is due to an initialization by vertex n0 and it only then can start a second
process. Because, after all, it is one of the vertices that will be
initialized first, probably even the first to intialize vertex, besides n0.
If the second (or third etc.) process would be started arbitrarily, than the
results of the multi-core run should differ between some consecutive executions
at least (if it was truly random).
In this context it would also be interesting to see how the whole approach
behaves when it is run on more than two processes or with a different graph,
probably having an initial vertex (start) with degree > 2 (possibly 3 for
easiness).

Best,

Cedric

On Friday, 10. December 2010 17:06:34 Mattia Lambertini wrote:
> > Perhaps you could attach the code of a minimal working example?
>
> of course, i can give you a link to the example i've tested [0] and here
> below is the graph i used (METIS):
>
> 5 9 1
> 3 1
> 2 2 4 1 5 2
> 2 7 4 3
> 5 1
> 1 1 2 1
>
> [0]
> http://www.boost.org/doc/libs/1_44_0/libs/graph_parallel/example/breadth_fi
> rst_search.cpp
>
> To compile the example you have to link:
>
> boost_graph_parallel-mt
> boost_mpi-mt
> boost_system-mt
>
> Thanks for your time.


Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net