Boost logo

Boost Users :

Subject: Re: [Boost-users] distributed breadth first search doesn't work properly?
From: Mattia Lambertini (mat.lambertini_at_[hidden])
Date: 2010-12-13 16:37:20


> If I correctly intepretate the question "To what end?"
> you are wondering why it is the smallest value and not the one that it
> should have when being executed with a single process?

right.

> If so, this would impose a checking of the process ID and at the long end
> this would lead, IMHO, to a distributed-BFS version that is actually
> fairly the same as the serial version. After all, it would have to wait
> each time for process 0 reaching that node and the running time should
> become equal (maybe worse due to the distributed overhead) to the serial
> version. So would/should be the result of the distributed execution...

IMHO it doesn't. A distributed algorithm should return the same ouput of the
sequential algorithm but in less time (wall clock time). Obviously not all the
algorithms can be distributed efficiently but the goal of a distributed
algorithm is to spread the computation over many nodes(give a slice of global
computation to each node) and at the end you'll get the same result of the
sequential one. All other distributed algorithms, which i used, works this
way.

i don't want to sound arrogant but i still don't understand the reasons behind
a design like this one.(I don't demand that you provide all the answers :))
What does such a result mean? Is it usefull for something?

> This is what the documentation says:
> <quote>
> 3. Since multiple distance values may be written for each vertex, we must
> always choose the best value we can find to update the distance map. The
> distributed property map distance needs to resolve distances to the
> smallest distance it has seen. For instance, process 0 may find vertex u
> at level 3 but process 1 finds it at level 5: the distance must remain at
> 3. To do this, we state that the property map's role is as a distance map,
> which introduces an appropriate reduction operation:
>
> set_property_map_role(vertex_distance, distance);
> <\quote>

I've thought that the documentation, in that section, was talking about the
smallest distance from the root node.
Le me explain:

Process 0 has found that the distance from vertex s to vertex u is 5 through
the path: s -> n1 -> n2 -> n3 -> n4 -> u

But process 1 has found that the distance from vertex n3 to vertex u is equal
to 1 (they are connected). Thus in the communication phase (accordingly to
BSP[0]) process 0 and 1 exchange their knowledge of the world and the result
should be:
s -> n1 -> n2 -> n3 -> u

> Maybe you could tell why you are particularily into a distributed BFS?
> There are of course obvious reasons like "It exists and I have the
> hardware here, so I want to use it". It might be that your reason is more
> specific and there's an alternative or a work-around possible?!

we have to analyze very huge graphs (milions of nodes) and we simply want to
do it in the shortest possible time. We tried a sequential and a
parallel(shared memory) approach but it takes too much time.

I want to thank you very much for your time.

[0] http://www.osl.iu.edu/research/pbgl/documentation/process_group.html

-- 
Cordiali saluti,
Mattia Lambertini

Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net