
From: Douglas Gregor (doug.gregor_at_[hidden])
Date: 2006-08-24 09:46:00


On Aug 24, 2006, at 4:49 AM, Magnus Ekdahl wrote:

> A while ago I had a problem with the MST algorithm in Boost.
> Essentially, profiling revealed that the MST algorithm was a
> bottleneck for the computation I was performing, so I had to
> investigate the performance bottlenecks in the Boost MST code. With
> this mail I want to explore the potential for improving the current
> Boost implementation.
>
> What is the proposed way of getting the prim algorithm updated in
> boost?

Just send a note to the Boost mailing list, as you've done.

> First, let me argue that there is room for improvement. I have made a
> test program that constructs 1000 random complete graphs with 1000
> nodes each, and the execution times are as follows:
>
> directly implementing Jarník's algorithm
> prim_minimum_spanning_tree 47.585s
>
> current prim (Dijkstra wrapper) 3m26.38s (338% slower)
>
> kruskal_minimum_spanning_tree 40m57s (5063% slower)

Those times seem... horrible. What compiler are you using? With which
optimization flags? Can you post the sample code, so we can see what
is going wrong?
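
For reference, the "directly implementing Jarník's algorithm" variant
Magnus benchmarks is presumably the classic array-based Prim/Jarník,
which runs in O(V^2) and is well suited to complete graphs. A minimal
sketch of that variant (my own illustration, not BGL's implementation;
the function name and adjacency-matrix input are assumptions) looks
like this:

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Array-based Jarník/Prim MST: O(V^2) time, which beats binary-heap
// variants on dense or complete graphs. 'w' is a symmetric V x V
// weight matrix (use infinity for absent edges); returns the total
// weight of the minimum spanning tree.
double prim_dense(const std::vector<std::vector<double>>& w) {
    const std::size_t n = w.size();
    if (n == 0) return 0.0;
    const double inf = std::numeric_limits<double>::infinity();
    std::vector<double> best(n, inf);   // cheapest edge linking v to the tree
    std::vector<bool> in_tree(n, false);
    best[0] = 0.0;                      // grow the tree from vertex 0
    double total = 0.0;
    for (std::size_t iter = 0; iter < n; ++iter) {
        // Linear scan for the cheapest vertex not yet in the tree.
        std::size_t u = n;
        for (std::size_t v = 0; v < n; ++v)
            if (!in_tree[v] && (u == n || best[v] < best[u])) u = v;
        in_tree[u] = true;
        total += best[u];
        // Relax the edges out of u.
        for (std::size_t v = 0; v < n; ++v)
            if (!in_tree[v] && w[u][v] < best[v]) best[v] = w[u][v];
    }
    return total;
}
```

On a 1000-vertex complete graph this does roughly 10^6 cheap array
operations per MST, which is consistent with the fastest timing above.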

> I also want to take the opportunity to ask about the exact meaning of
> the minimum spanning tree problem as such, since future efficient
> implementations could be helped a lot by knowing precisely what
> problem is to be solved.

Wikipedia has a decent definition:

   http://en.wikipedia.org/wiki/Minimum_spanning_tree

> Given how hard it is to understand the actual generality of the MST
> problem the current implementation tries to solve, I find it
> difficult to see how efficient algorithms can be selected
> heuristically. E.g., can you use num_edges(g) as an indicator of
> sparseness?

We would need to run some benchmarks to determine what heuristic to
use to dynamically select the most efficient algorithm. We haven't
really done this yet for the BGL, but we should. MST is one good
example; all-pairs shortest-paths is another good example, because,
e.g., the Floyd-Warshall algorithm is much better for dense graphs
while Johnson's algorithm is much better for sparse graphs. I imagine
that we could use some kind of threshold for num_edges(g)/
(num_vertices(g)*num_vertices(g)) to decide whether a graph is sparse
or dense.
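
The threshold idea above could be sketched as a small dispatch helper.
This is purely illustrative: the function name, signature, and the
0.25 cut-off are my assumptions, not benchmarked values or BGL API.

```cpp
#include <cstddef>
#include <string>

// Hypothetical density heuristic: compare |E| / |V|^2 against a
// tunable threshold to pick an MST algorithm. The 0.25 default is an
// illustrative guess; a real cut-off would come from benchmarks.
std::string pick_mst_algorithm(std::size_t num_vertices,
                               std::size_t num_edges,
                               double dense_threshold = 0.25) {
    if (num_vertices == 0) return "prim";  // degenerate case; either works
    const double density =
        static_cast<double>(num_edges) /
        (static_cast<double>(num_vertices) * static_cast<double>(num_vertices));
    // Dense graphs favor the array-based Prim/Jarník; sparse graphs
    // favor Kruskal (or a heap-based Prim).
    return density >= dense_threshold ? "prim" : "kruskal";
}
```

For a 1000-vertex complete graph (about 499,500 undirected edges) the
density is roughly 0.5, so such a heuristic would choose Prim; a graph
with only a few thousand edges would fall through to Kruskal.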

   Doug


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk