Subject: Re: [boost] [BGL] Trying to get a correctly working Parallel BFS code
From: Sandeep Gupta (gupta.sandeep_at_[hidden])
Date: 2009-11-09 20:22:03


On Wed, Nov 4, 2009 at 3:39 PM, Nick Edmonds <ngedmond_at_[hidden]> wrote:

> On Nov 4, 2009, at 6:22 PM, Sandeep Gupta wrote:
>
>> On Wed, Nov 4, 2009 at 3:11 PM, Sandeep Gupta <gupta.sandeep_at_[hidden]> wrote:
>>
>>> On Wed, Nov 4, 2009 at 12:18 PM, Nick Edmonds <ngedmond_at_[hidden]> wrote:
>>>
>>>> On Nov 3, 2009, at 10:44 PM, Sandeep Gupta wrote:
>>>>
>>>>> On Mon, Nov 2, 2009 at 12:43 PM, Nick Edmonds <ngedmond_at_[hidden]> wrote:
>>>>>
>>>>>> On Oct 31, 2009, at 1:00 AM, Sandeep Gupta wrote:
>>>>>>
>>>>>>> On Fri, Oct 30, 2009 at 6:41 AM, Jeremiah Willcock <jewillco_at_[hidden]> wrote:
>>>>>>>
>>>>>>>> On Thu, 29 Oct 2009, Sandeep Gupta wrote:
>>>>>>>>
>>>>>>>>> On Thu, Oct 29, 2009 at 1:12 PM, Jeremiah Willcock <jewillco_at_[hidden]> wrote:
>>>>>>>>>
>>>>>>>>>> On Thu, 29 Oct 2009, Sandeep Gupta wrote:
>>>>>>>>>>
>>>>>>>>>>> On Wed, Oct 28, 2009 at 5:49 PM, Sandeep Gupta <gupta.sandeep_at_[hidden]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Oct 27, 2009 at 9:31 PM, Jeremiah Willcock <jewillco_at_[hidden]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, 27 Oct 2009, Sandeep Gupta wrote:
>>>>>>>>>>>>>> The error snippet regarding the predecessor map:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> error: no match for call to '(boost::property_reduce<boost::vertex_predecessor_t>::apply<boost::detail::error_property_not_found>) (const boost::detail::parallel::global_descriptor<unsigned int>&)'
>>>>>>>>>>>>>> ../../../boost/property_map/parallel/impl/distributed_property_map.ipp:141: error: no match for call to '(boost::property_reduce<boost::vertex_predecessor_t>::apply<boost::detail::error_property_not_found>) (const boost::detail::parallel::global_descriptor<unsigned int>&, boost::detail::error_property_not_found&, const boost::detail::error_property_not_found&)'
>>>>>>>>>>>>>> ../../../boost/graph/parallel/properties.hpp:95: note: candidates are: T boost::property_reduce<boost::vertex_predecessor_t>::apply<T>::operator()(T) const [with T = boost::detail::error_property_not_found]
>>>>>>>>>>>>>> ../../../boost/graph/parallel/properties.hpp:96: note: T boost::property_reduce<boost::vertex_predecessor_t>::apply<T>::operator()(T, T, T) const [with T = boost::detail::error_property_not_found]
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Jeremiah, do you think that I should file a bug report for this? Although I was hoping (and actually need it quite urgently) that it would be a minor issue and get fixed quickly.
>>>>>>>>>>>>>
>>>>>>>>>>>>> You need to pass a predecessor map to the algorithm as one of the parameters (or put it as a part of your graph, but making an external property map to give to the algorithm is easier).
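
For illustration, a rough sketch of one way to do that: an external predecessor map over local storage, indexed by the graph's vertex_index map and recorded on tree edges, mirroring the record_distances pattern used later in this thread. Graph, g and start are assumed to already exist; this is a sketch, not code from the thread.

    #include <vector>
    #include <boost/graph/breadth_first_search.hpp>
    #include <boost/graph/visitors.hpp>
    #include <boost/property_map/property_map.hpp>

    typedef boost::graph_traits<Graph>::vertex_descriptor vertex_descriptor;
    typedef boost::property_map<Graph, boost::vertex_index_t>::type IndexMap;

    void bfs_with_predecessors(Graph& g, vertex_descriptor start)
    {
      // One slot per locally owned vertex, addressed through vertex_index.
      std::vector<vertex_descriptor> pred_storage(num_vertices(g));
      boost::iterator_property_map<std::vector<vertex_descriptor>::iterator, IndexMap>
          predecessor(pred_storage.begin(), get(boost::vertex_index, g));

      // Record the predecessor of each vertex when its tree edge is discovered.
      boost::breadth_first_search(g, start,
          boost::visitor(boost::make_bfs_visitor(
              boost::record_predecessors(predecessor, boost::on_tree_edge()))));
    }
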
>>>>>>>>>>>>
>>>>>>>>>>>> Hi Jeremy,
>>>>>>>>>>>>
>>>>>>>>>>>> It took me a while but I finally figured out how to pass the predecessor map. Unfortunately it doesn't have any effect. Also, I might be wrong, but I don't see any logical reason why the predecessor map should have any bearing on the correctness of the depth. I have attached the new code. I am not able to print out the predecessor because I am not able to figure out its type. Maybe you could help me resolve this.
>>>>>>>>>>>>
>>>>>>>>>>>> It would be nice to have this code running. I need to profile graph performance on a new machine by tomorrow. Again, thanks for your patience and time. I really appreciate you looking into this.
>>>>>>>>>>>
>>>>>>>>>>> Hi Jeremiah,
>>>>>>>>>>>
>>>>>>>>>>> Was just wondering if you had time to look into this, or any suggestion on how to proceed further. I can get you the output of the predecessor map if that would help. It's just that I haven't been able to figure out the type of the property map returned by make_distributed_property_map.
>>>>>>>>>>> Please let me know your thoughts.
>>>>>>>>>>
>>>>>>>>>> The person who really knows this library is not around as far as I know; he might be on vacation. The type returned from make_distributed_property_map is documented on <URL: http://www.osl.iu.edu/research/pbgl/documentation/graph/distributed_property_map.html>. One thing that you could do to debug things is to put in your own color map and look at what values it ends up with at the end of the algorithm, and possibly even to put in a local color map that prints out when elements are changed. That might be a lot of work for what could end up being an obvious problem (that I just don't know how to diagnose), though. BTW, does the example in libs/graph_parallel/example/breadth_first_search.cpp work for you? It appears to be doing the same thing you want to do. If that works, it could be a problem in your visitor.
>>>>>>>>>
>>>>>>>>> I derived this code from graph_parallel/example/breadth_first_search.cpp.
>>>>>>>>>
>>>>>>>>> The problem is that I don't understand the input and the output of the example code. It would be great if I could get an explanation of the format of the input file. Then I can transform my graph into that format and pass it to the algorithm. That would be enough for my current purpose.
>>>>>>>>>
>>>>>>>>> As time permits I will try out your suggestions and let you know of any updates. In the meantime I am hoping that I will get input from other relevant authors.
>>>>>>>>
>>>>>>>> This is what I could tell from the code.
>>>>>>>>
>>>>>>>> Input:
>>>>>>>> http://people.sc.fsu.edu/~burkardt/data/metis_graph/metis_graph.html
>>>>>>>>
>>>>>>>> Output:
>>>>>>>> http://www.graphviz.org/doc/info/lang.html
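
For concreteness (an illustration, not part of the original messages): in that METIS format the first line gives the vertex and edge counts, and line i then lists the 1-based neighbours of vertex i. An undirected 4-node line graph 1-2-3-4, like the small test graph tried below, would therefore be written as:

    4 3
    2
    1 3
    2 4
    3
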
>>>>>>>
>>>>>>> Thanks Jeremiah. The link was helpful. I understood the input graph format. It's for undirected graphs only. However the BFS output is still coming out to be incorrect. I tried with a sample line graph of 4 nodes and it exhibits the same problem as before.
>>>>>>>
>>>>>>> The distances of the nodes on process 2 are not getting updated. On the other hand the dijkstra shortest paths example is working correctly.
>>>>>>>
>>>>>>> Thanks
>>>>>>> Sandeep
>>>>>>
>>>>>> I rewrote most of your code before I realized that you're outputting all your distances on process 0, even for vertices process 0 doesn't own. The large value you see is std::numeric_limits<distance_type>::max(). When you call get(distance, vertex(i, g)) for some i not owned by the current process, a ghost cell is created for the requested value and a request is sent to the owner of that vertex for the correct value. get() then returns whatever the default value is for the distance property map, which in this case is infinity (i.e. std::numeric_limits<distance_type>::max()).
>>>>>>
>>>>>> In some cases you may get the correct distance value because process 0 has previously requested that value (and it may or may not be up to date). If process 0 has never requested that distance value and it doesn't own the vertex, then you'll get infinity.
>>>>>>
>>>>>> You need to gather all your data before you output it from a single node (i.e. issue a get(), or a request(), which is a get with no return value, for each data element). If you want your code to be scalable you should output your data in a distributed fashion from all nodes at once. Remember that updated values won't be guaranteed to arrive until the next synchronization step.
>>>>>>
>>>>>> If you want the modified version of your code let me know and I'll finish it and ship it your way.
>>>>>>
>>>>>> -Nick
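
As an illustration of the "gather before you print" pattern described above (a sketch only, not code from the thread; n, the global vertex count, is assumed to be known, and the request()/synchronize() spellings should be checked against the distributed property map documentation linked earlier):

    // Rank 0 queues a request for every distance value, including ones it
    // does not own ...
    if (process_id(process_group(g)) == 0)
      for (std::size_t i = 0; i < n; ++i)
        request(distance, vertex(i, g));

    // ... the requests are serviced at the next synchronization step ...
    synchronize(process_group(g));

    // ... after which get() returns the owners' values rather than the default.
    if (process_id(process_group(g)) == 0)
      for (std::size_t i = 0; i < n; ++i)
        std::cout << i << " : " << get(distance, vertex(i, g)) << std::endl;
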
>>>>> Hi Nick,
>>>>>
>>>>> Thanks so much for looking into this. Sorry, I couldn't get to it earlier due to other concerns. I corrected the mistake you mentioned and I don't get numeric_limits<>::max() for any nodes anymore.
>>>>>
>>>>> I believe graph/distributed/graphviz.hpp was outputting the distance in the correct fashion.
>>>>>
>>>>> Unfortunately, the output is still coming out to be incorrect. It was also the case with the output from graphviz.hpp.
>>>>>
>>>>> For the test graph that we mentioned before, essentially
>>>>> 0 --> 1 --> 2 --> 3,
>>>>> 0 --> 4 --> 3
>>>>>
>>>>> the distance output is as follows:
>>>>> global node-id : distance
>>>>> 0 : 0
>>>>> 1 : 1
>>>>> 2 : 2
>>>>> 3 : 1  // wrong: should be 2
>>>>> 4 : 0  // should be 1
>>>>>
>>>>> If possible can you send me your version of the code. Let me know if you need mine (the updated version).
>>>>>
>>>>
>>>> Given the description of your problem above, and assuming you're using a block distribution with 8 vertices, it sounds like you're initializing the distance to the first vertex on *each* process to 0, as opposed to only the distance to the source. The vertex with global index 4 is actually local index 0 on process one using the above distribution. If you initialize the distances to the vertices with global indices 0 and 4 to 0, then your results make sense. Let me know if that's not the case, but I surmise that it is.
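
A sketch of the difference (the names here are assumptions, not code from the thread; distance is taken to be the usual iterator_property_map over a local vector and get(vertex_index, g)):

    // Wrong: writes slot 0 of the *local* storage on every process, so with a
    // block distribution over two ranks both global vertex 0 and global
    // vertex 4 start out at distance 0.
    local_distances[0] = 0;

    // Intended: address the one true source through its global descriptor.
    // vertex(0, g) names the same global vertex on every rank; on ranks that
    // do not own it this should only touch a ghost cell.
    put(distance, vertex(0, g), 0);
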
>>>>
>>> Very much so :-). Thanks for getting me started by catching these minor but fatal misunderstandings in the PBFS code. I understand now that the call vertex(i, g) creates a descriptor for the ith global vertex.
>>>
>>>> Also there's a problem in your depth labeling visitor. In the case of cross-processor edges the process that owns the target of the edge doesn't necessarily have the distance to the source of the edge available locally. This will lead to incorrect distance labels, even though tree and non-tree edges will still be correctly recognized since the BFS is level-synchronized. The fastest way to handle this is to modify the BFS queue to also pass distance data (which is how the shortest paths algorithms do it). You could also send the data separately and probe for it on the receiving side. Check out the distributed betweenness centrality code for how to create a distributed queue which contains tuples of data, one element of which is a vertex (in order to know where to route the data). Integrating this type of vertex queue (i.e. one that contains a data payload) with BFS basically just requires handling the data payload on the receiving side.
>>>>
>>> I would try this out. So basically the default call
>>>
>>>   boost::breadth_first_search(g, start,
>>>       boost::visitor(boost::make_bfs_visitor(
>>>           boost::record_distances(distance, boost::on_tree_edge()))));
>>>
>>> wouldn't work. I assumed that the distributed BFS implementation did exactly what you mentioned. I will also write my own parallel visitor, per your suggestions.
>>>
>>> thanks
>>> Sandeep
>>
>> Let me also add that I did modify the code so that it initializes only the distance to the 0th vertex of processor 0. In this case I am facing the dilemma of what I should pass as the start vertex for processes > 0. I can't pass the global start vertex (probably the type won't be acceptable to breadth_first_search).
>>
>> I will dig further into the betweenness centrality and dijkstra code to figure this out. Appreciate your input.
>>
>
> The short answer is: just pass the same source vertex on all processors.
>
> Explanation: Each processor will push the source vertex onto the distributed queue, but all the non-local pushes end up being no-ops, since the source vertex is black at the end of the first BFS level when the messages arrive. The source vertex just has to be a graph_traits<Graph>::vertex_descriptor, which you can construct using the vertex() call if you want to generate it from a global vertex index. The fact that all processors push the source vertex is an artifact of reusing the sequential BFS code; it could be changed, but I haven't found it to be an issue thus far.
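
For illustration (a sketch under the same assumptions as the breadth_first_search call quoted earlier in the thread, not Nick's code): every rank builds the descriptor for the global source and passes it straight to the algorithm.

    // Same global vertex 0 on every process; the non-local pushes become
    // no-ops as described above.
    boost::graph_traits<Graph>::vertex_descriptor start = vertex(0, g);
    boost::breadth_first_search(g, start,
        boost::visitor(boost::make_bfs_visitor(
            boost::record_distances(distance, boost::on_tree_edge()))));
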
>
> This actually brings up an interesting feature of BFS: it's possible to start a BFS from multiple source vertices by passing a different source on each processor (you can also pass a queue containing additional starting vertices if you need > num_procs sources). The strongly connected components algorithm uses this approach to run many BFSs on disconnected subgraphs in parallel.
>
> One last note on your visitor: my statement was incorrect, sorry. I was looking at some other code and confused it with yours; your depth labeling visitor will work fine because it 'pushes' distance labels. Basically it writes the distance to the vertex it pushes onto the queue into the distributed property map at the same time (actually immediately before). This means that at the next synchronization step both the distance and the vertex descriptor will arrive; in fact the ordering of the messages ensures that the distance will arrive prior to the vertex on the queue. Sorry about that, I was looking at another visitor that was 'pulling' data, which is much more problematic.
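
For illustration, a minimal sketch of what such a 'pushing' depth labeler can look like (equivalent in spirit to record_distances on tree edges; DistanceMap is whatever distance map is passed in -- a sketch, not code from the thread):

    #include <boost/graph/breadth_first_search.hpp>

    // On each tree edge, write distance(target) = distance(source) + 1 into
    // the property map immediately before target goes onto the queue.
    template <typename DistanceMap>
    struct depth_labeler : boost::default_bfs_visitor
    {
      explicit depth_labeler(DistanceMap d) : distance(d) {}

      template <typename Edge, typename Graph>
      void tree_edge(Edge e, const Graph& g) const
      {
        put(distance, target(e, g), get(distance, source(e, g)) + 1);
      }

      DistanceMap distance;
    };
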
>
> Hopefully that solves all your problems, and my apologies again for leading you to believe there was something wrong with your visitor.
>
> The data model is a little tricky to wrap your head around, but once you've written some code it should become more intuitive.
>
Hi Nick,

Thanks for the explanation of the distributed BFS. This is indeed quite helpful, especially at later stages. I haven't gotten to a point where I can comment on the data model, but the current architecture is swell from what I can judge.

I hesitate to ask for an explanation of every minor detail, but I am now confused about passing the same vertex descriptor. Specifically, I didn't quite grasp the statement "the source vertex just has to be graph_traits<Graph>::vertex_descriptor, which you can construct using the vertex() call if you want to generate it from a global vertex index". The source vertex has to be a graph_traits<Graph>::vertex_descriptor, but vertex() takes two arguments, the index and the graph, and these two parameters can only identify local vertices. For global vertices we need global descriptors, which also require the owner.

Appreciate your help in clarifying this.

Thanks
Sandeep

