|
Boost : |
Subject: Re: [boost] [BGL] Trying to get a correclty working Parallel BFS code
From: Sandeep Gupta (gupta.sandeep_at_[hidden])
Date: 2009-10-27 23:33:21
On Tue, Oct 27, 2009 at 12:33 PM, Sandeep Gupta <gupta.sandeep_at_[hidden]>wrote:
>
>
> On Tue, Oct 27, 2009 at 10:27 AM, Jeremiah Willcock <jewillco_at_[hidden]>wrote:
>
>> On Tue, 27 Oct 2009, Sandeep Gupta wrote:
>>
>> On Mon, Oct 26, 2009 at 11:29 PM, Jeremiah Willcock <jewillco_at_[hidden]
>>> >wrote:
>>>
>>> On Mon, 26 Oct 2009, Sandeep Gupta wrote:
>>>>
>>>> Hi All,
>>>>
>>>>> I have spent significant effort on parallel bfs. Currently its almost
>>>>> working. I would really appreciate some help to get it running
>>>>> correctly.
>>>>> I
>>>>> have attached the code (self contained and requires no input file).
>>>>> I think I have followed all the guidelines regarding distributed
>>>>> property
>>>>> maps mentioned in the docs (
>>>>>
>>>>>
>>>>> http://www.osl.iu.edu/research/pbgl/documentation/graph/breadth_first_search.html
>>>>> ).
>>>>> I am out of ideas of where things can could have gone wrong.
>>>>>
>>>>>
>>>> The graph is distributed by source vertex across all processors, and so
>>>> you
>>>> need to add them on every processor (or at least on the processor that
>>>> owns
>>>> them, which is probably not processor 0). Thus, your graph is missing
>>>> whichever edges would be on processor 1, leading to incorrect results.
>>>> Try
>>>> adding all of the edges on all processors and see if this fixes your
>>>> problem.
>>>>
>>>> -- Jeremiah Willcock
>>>>
>>>>
>>>>
>>> Jeremiah, thanks so much looking into this promptly. I took cues from
>>> building distributed graph section of distributed adjacency list page
>>> which
>>> has an example:
>>>
>>> ///---------------Begin example
>>>
>>> Graph g(n); // initialize with the total number of vertices, n
>>> if (process_id(g.process_group()) == 0) {
>>> // Only process 0 loads the graph, which is distributed automatically
>>> int source, target;
>>> for (std::cin >> source >> target)
>>> add_edge(vertex(source, g), vertex(target, g), g);
>>> }
>>>
>>> // All processes synchronize at this point, then the graph is complete
>>> synchronize(g.process_group());
>>> //---------End
>>>
>>> Looking at this, it seems that distribution happens automatically which
>>> is
>>> what the manual says.
>>> This is further confirmed when I output the subgraphs at each process.
>>>
>>
>> OK. I was mistaken on this point then. Could you please try the
>> following graph with a distribution that puts vertices 0-3 on rank 0 and 4
>> on rank 1 (just add three dummy vertices on the end and use a block
>> distribution)?
>>
>> 0 -> 1
>> 1 -> 2
>> 2 -> 3
>> 0 -> 4
>> 4 -> 3
>>
>> See if the depths are reasonable for that graph. Also, the contents of
>> the predecessor maps output by your search (both the original one and the
>> graph I just gave) would be nice too; that is just an extra map you can give
>> to BFS and you can just dump it the same way you dump depth values.
>>
>>
>
>> I tried adding predecessor map but I don't think its currently. No
>> reduction function is defined for predecessor map. Please see below for the
>> attached error.
>>
> I tried the bfs with your input (and distribution) and the depth number
> are coming out to be incorrect. This is a surely a better input to debug
> this problem:-).
> The depth value of node n=3 should be d=2 but its coming out to be d=0.
> I believe the update message from node n=4 on process 1 is not received or
> updated by process 0.
> Attached below is the output of levels:
> Finished Parallel BFS
> Node ID : Level
> 0 : 0
>
> 1 : 1
> 2 : 2
> 3 : 0
> 4 : 1
> 5 : 4294967295
> 6 : 4294967295
> 7 : 4294967295
>
> The two subgraphs on each process
> Subgraph Begin
> n0 -> n1;
> n0 -> n4;
> n1 -> n2;
> n2 -> n3;
> Subgraph End
>
> Subgraph Begin
> n4 -> n3;
> Subgraph End
>
> Thanks,
> Sandeep
>
> //-------------------------
> The error snippet regarding predecessor map:
>
> error: no match for call to
> '(boost::property_reduce<boost::vertex_predecessor_t>::apply<boost::detail::error_property_not_found>)
> (const boost::detail::parallel::global_descriptor<unsigned int>&)'
>
>
> ../../../boost/property_map/parallel/impl/distributed_property_map.ipp:141:
> error: no match for call to
> '(boost::property_reduce<boost::vertex_predecessor_t>::apply<boost::detail::error_property_not_found>)
> (const boost::detail::parallel::global_descriptor<unsigned int>&,
> boost::detail::error_property_not_found&, const
> boost::detail::error_property_not_found&)'
> ../../../boost/graph/parallel/properties.hpp:95: note: candidates are: T
> boost::property_reduce<boost::vertex_predecessor_t>::apply<T>::operator()(T)
> const [with T = boost::detail::error_property_not_found]
> ../../../boost/graph/parallel/properties.hpp:96: note: T
> boost::property_reduce<boost::vertex_predecessor_t>::apply<T>::operator()(T,
> T, T) const [with T = boost::detail::error_property_not_found]
>
>
> Jeremiah, Do you think that I should file a bug report for this.
Although I was hoping (and actually needed quite urgently) that it would be
minor issue and get fixed quickly.
Thanks,
Sandeep
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk