Boost logo

Boost Users :

Subject: Re: [Boost-users] distributed breadth first search doesn't work properly?
From: Cedric Laczny (cedric.laczny_at_[hidden])
Date: 2010-12-14 05:08:23


On Monday, 13. December 2010 22:37:20 Mattia Lambertini wrote:
> > If I correctly intepretate the question "To what end?"
> > you are wondering why it is the smallest value and not the one that it
> > should have when being executed with a single process?
>
> right.
>
> > If so, this would impose a checking of the process ID and at the long end
> > this would lead, IMHO, to a distributed-BFS version that is actually
> > fairly the same as the serial version. After all, it would have to wait
> > each time for process 0 reaching that node and the running time should
> > become equal (maybe worse due to the distributed overhead) to the serial
> > version. So would/should be the result of the distributed execution...
>
> IMHO it doesn't. A distributed algorithm should return the same ouput of
> the sequential algorithm but in less time (wall clock time). Obviously not
> all the algorithms can be distributed efficiently but the goal of a
> distributed algorithm is to spread the computation over many nodes(give a
> slice of global computation to each node) and at the end you'll get the
> same result of the sequential one. All other distributed algorithms, which
> i used, works this way.
>
> i don't want to sound arrogant but i still don't understand the reasons
> behind a design like this one.(I don't demand that you provide all the
> answers :)) What does such a result mean? Is it usefull for something?
>

I agree with you. It's just that it was designed this way for some reason and
that's exactly how it seems to work actually. To find out the real reasons for
this design decision, you could try to mail the authors of the PBGL-BFS
personally.

> > This is what the documentation says:
> > <quote>
> > 3. Since multiple distance values may be written for each vertex, we must
> > always choose the best value we can find to update the distance map. The
> > distributed property map distance needs to resolve distances to the
> > smallest distance it has seen. For instance, process 0 may find vertex u
> > at level 3 but process 1 finds it at level 5: the distance must remain at
> > 3. To do this, we state that the property map's role is as a distance
> > map, which introduces an appropriate reduction operation:
> >
> > set_property_map_role(vertex_distance, distance);
> > <\quote>
>
> I've thought that the documentation, in that section, was talking about the
> smallest distance from the root node.
> Le me explain:
>
> Process 0 has found that the distance from vertex s to vertex u is 5
> through the path: s -> n1 -> n2 -> n3 -> n4 -> u
>
> But process 1 has found that the distance from vertex n3 to vertex u is
> equal to 1 (they are connected). Thus in the communication phase
> (accordingly to BSP[0]) process 0 and 1 exchange their knowledge of the
> world and the result should be:
> s -> n1 -> n2 -> n3 -> u
>

This is indeed a good point... Which makes me refer again to an eMail directly
to the original authors. I hoped that keeping the discussion alive would
either result into a solution found by discussion or the participation of the
authors or someone more familiar with the topic.
This discussion could also indicate the need to update the documentation,
perhaps with a concrete example demonstrating the behavior of the PBGL-BFS.

> > Maybe you could tell why you are particularily into a distributed BFS?
> > There are of course obvious reasons like "It exists and I have the
> > hardware here, so I want to use it". It might be that your reason is more
> > specific and there's an alternative or a work-around possible?!
>
> we have to analyze very huge graphs (milions of nodes) and we simply want
> to do it in the shortest possible time. We tried a sequential and a
> parallel(shared memory) approach but it takes too much time.
>
>
> I want to thank you very much for your time.

You are welcome.

> [0] http://www.osl.iu.edu/research/pbgl/documentation/process_group.html

Best,

Cedric


Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net