Boost logo

Boost :

Subject: Re: [boost] [BGL] Trying to get a correclty working Parallel BFS code
From: Sandeep Gupta (gupta.sandeep_at_[hidden])
Date: 2009-11-03 22:44:07


On Mon, Nov 2, 2009 at 12:43 PM, Nick Edmonds <ngedmond_at_[hidden]>wrote:

>
> On Oct 31, 2009, at 1:00 AM, Sandeep Gupta wrote:
>
> On Fri, Oct 30, 2009 at 6:41 AM, Jeremiah Willcock <jewillco_at_[hidden]
>> >wrote:
>>
>> On Thu, 29 Oct 2009, Sandeep Gupta wrote:
>>>
>>> On Thu, Oct 29, 2009 at 1:12 PM, Jeremiah Willcock <jewillco_at_[hidden]
>>>
>>>> wrote:
>>>>>
>>>>
>>>> On Thu, 29 Oct 2009, Sandeep Gupta wrote:
>>>>
>>>>>
>>>>> On Wed, Oct 28, 2009 at 5:49 PM, Sandeep Gupta <
>>>>> gupta.sandeep_at_[hidden]
>>>>>
>>>>> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, Oct 27, 2009 at 9:31 PM, Jeremiah Willcock <
>>>>>>> jewillco_at_[hidden]
>>>>>>>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>> On Tue, 27 Oct 2009, Sandeep Gupta wrote:
>>>>>>>
>>>>>>>
>>>>>>>> The error snippet regarding predecessor map:
>>>>>>>>
>>>>>>>>
>>>>>>>> error: no match for call to
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> '(boost::property_reduce<boost::vertex_predecessor_t>::apply<boost::detail::error_property_not_found>)
>>>>>>>>>> (const boost::detail::parallel::global_descriptor<unsigned int>&)'
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> ../../../boost/property_map/parallel/impl/distributed_property_map.ipp:141:
>>>>>>>>>> error: no match for call to
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> '(boost::property_reduce<boost::vertex_predecessor_t>::apply<boost::detail::error_property_not_found>)
>>>>>>>>>> (const boost::detail::parallel::global_descriptor<unsigned int>&,
>>>>>>>>>> boost::detail::error_property_not_found&, const
>>>>>>>>>> boost::detail::error_property_not_found&)'
>>>>>>>>>> ../../../boost/graph/parallel/properties.hpp:95: note: candidates
>>>>>>>>>> are:
>>>>>>>>>> T
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> boost::property_reduce<boost::vertex_predecessor_t>::apply<T>::operator()(T)
>>>>>>>>>> const [with T = boost::detail::error_property_not_found]
>>>>>>>>>> ../../../boost/graph/parallel/properties.hpp:96: note:
>>>>>>>>>> T
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> boost::property_reduce<boost::vertex_predecessor_t>::apply<T>::operator()(T,
>>>>>>>>>> T, T) const [with T = boost::detail::error_property_not_found]
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Jeremiah, Do you think that I should file a bug report for this.
>>>>>>>>>>
>>>>>>>>>> Although I was hoping (and actually needed quite urgently) that it
>>>>>>>>>>
>>>>>>>>>> would
>>>>>>>>> be
>>>>>>>>> minor issue and get fixed quickly.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> You need to pass a predecessor map to the algorithm as one of the
>>>>>>>>>
>>>>>>>> parameters
>>>>>>>> (or put it as a part of your graph but making an external property
>>>>>>>> map
>>>>>>>> to
>>>>>>>> give
>>>>>>>> to the algorithm is easier).
>>>>>>>>
>>>>>>>> Hi Jeremy,
>>>>>>>>
>>>>>>>>
>>>>>>>> It took me a while but I finally figure out how to pass the
>>>>>>> predecessor
>>>>>>>
>>>>>>> map. Unfortunately it doesn't have any effect. Also, I might be
>>>>>>>> wrong
>>>>>>>> but I
>>>>>>>> don't see any logical reason why predecessor map should have any
>>>>>>>> bearing
>>>>>>>> on
>>>>>>>> the correctness of the depth. I have attached the new code. I am
>>>>>>>> not
>>>>>>>> able
>>>>>>>> to print out the predecessor because I am not able to figure out its
>>>>>>>> type.
>>>>>>>> Maybe you could help me resolve this.
>>>>>>>>
>>>>>>>>
>>>>>>>> It would be nice to have this code running. I need to profile graph
>>>>>>>>
>>>>>>> performance on a new machine by tomorrow. Again, thanks for you
>>>>>>> patience
>>>>>>> and
>>>>>>> time. I really appreciate you looking into this.
>>>>>>>
>>>>>>> Hi Jeremiah,
>>>>>>>
>>>>>>> Was just wondering if you had time to look into this or any
>>>>>>> suggestion
>>>>>>>
>>>>>>> on
>>>>>> how further proceed. I can get you the output of predecessor map if
>>>>>> that
>>>>>> would help.
>>>>>> Just that I haven't be been able to figure out what the is type of the
>>>>>> property map returned by make_distributed_property_map.
>>>>>> Please let me know your thoughts.
>>>>>>
>>>>>>
>>>>>> The person who really knows this library is not around as far as I
>>>>> know;
>>>>> he
>>>>> might be on vacation. The type returned from
>>>>> make_distributed_property_map
>>>>> is documented on <URL:
>>>>> http://www.osl.iu.edu/research/pbgl/documentation/graph/
>>>>> distributed_property_map.html>. One thing that you could do to debug
>>>>> things is to put in your own color map and look at what values it ends
>>>>> up
>>>>> with at the end of the algorithm, and possibly even to put in a local
>>>>> color
>>>>> map that prints out when elements are changed. That might be a lot of
>>>>> work
>>>>> for what could end up being an obvious problem (that I just don't know
>>>>> how
>>>>> to diagnose), though. BTW, does the example in
>>>>> libs/graph_parallel/example/breadth_first_search.cpp work for you? It
>>>>> appears to be doing the same thing you want to do. If that works, it
>>>>> could
>>>>> be a problem in your visitor.
>>>>>
>>>>>
>>>>> I derived this code from
>>>>> graph_parallel/example/breadth_first_search.cpp.
>>>>>
>>>>> The problem is that I don't understand the input and the output of the
>>>> example code. It would be great if I could get an explanation for the
>>>> format of the input file. Then I can transform my graph into that
>>>> format
>>>> and pass to the algorithm. That would be enough for my current purpose.
>>>>
>>>> As time permits I would try out your suggestions and let know of update.
>>>> In
>>>> the meantime I am hoping that I would get input from other relevant
>>>> authors.
>>>>
>>>>
>>> This is what I could tell from the code.
>>>
>>> Input:
>>> http://people.sc.fsu.edu/~burkardt/data/metis_graph/>
>>> metis_graph.html<
>>>
http://people.sc.fsu.edu/%7Eburkardt/data/metis_graph/metis_graph.html>
>>>
>>>
>>> Output:
>>> http://www.graphviz.org/doc/info/lang.html
>>>
>>>
>>> Thanks Jeremiah. The link was helpful. I understood the input graph
>>> format.
>>> Its for undirected graph only. However the BFS output is still coming out
>>> to
>>> be incorrect. I tried with a sample line graph of 4 nodes and it
>>> exhibits the same problem as before.
>>>
>>> The distances of nodes on process 2 is not getting updated. On the other
>> hand dijkstra shortest paths example is working correctly.
>>
>> Thanks
>> Sandeep
>>
>
> I rewrote most of your code before I realized that you're outputting all
> your distances on process 0, even for vertices process 0 doesn't own. The
> large value you see is std::numeric_limits<distance_type>::max. When you
> call get(distance, vertex(i, g)) for some i not owned by the current process
> a ghost cell is created for the requested value and a request sent to the
> owner of that vertex for the correct value. get() then returns whatever the
> default value is for the distance property map, which in this case is
> infinity (i.e. std::numeric_limits<distance_type>::max.
>
> In some cases you may get the correct distance value because process 0 has
> previously requested that value (and it may or may not be up to date). If
> process 0 has never requested that distance value and it doesn't own the
> vertex, then you'll get infinity.
>
> You need to gather all your data before you output it from a single node
> (i.e. issue a get() or request() which is a get with no return value, for
> each data element). If you want your code to be scalable you should output
> your data in a distributed fashion from all nodes at once. Remember that
> updated values won't be guaranteed to arrive until the next synchronization
> step.
>
> If you want the modified version of your code let me know and I'll finish
> it and ship it your way.
>
> -Nick
>
>
> Hi Nick,
>
   Thanks so much for looking into this. Sorry, I couldn't get to it earlier
due to other concerns. I corrected the mistake you mentioned and I don't get
numeric_limit<>::max() for any nodes anymore.

I believe the graph/distributed/graphviz.hpp was outputting the distance in
correct fashion.

Unfortunately, the output is still coming out to be incorrect.
It was also the case with output from graphviz.hpp.

For the test graph that we mentioned before, essentially
0 --> 1 --> 2 --> 3,
0 --> 4 --> 3

the distance output is as follows:
global node-id : distance
0 : 0
1 : 1
2 : 2
3 : 1 //(wrong: should be 2)
4 : 0 // should be 1

If possible can you send me your version of the code. Let me know if you
need mine (the updated version).

Thanks
Sandeep


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk