Boost logo

Boost Users :

Subject: Re: [Boost-users] Using BGL for huge grid graphs
From: Nicholas Edmonds (ngedmond_at_[hidden])
Date: 2014-01-21 03:06:23


On Jan 15, 2014, at 3:10 AM, Knut Krause <knut.krause_at_[hidden]> wrote:

> Hi,
>
> I'm planning on using the BGL on huge graphs the represent a grid where each
> vertex is a cell connected to it's 8 neighbours. I think something around
> 2,500,000,000 vertices would be a good number to approach but it would be
> great if it could handle (a lot) more later.
>
> If I got this graph I have to find the shortest path between two vertices.
>
> Does anyone here have experiences with graphs that large? Is this even worth a
> try on "normal" hardware?
>
> The edge weights would be stored in the vertices. So if you want to go from
> (a) to (b) you'd have to pay the cost stored in (b). The graph is not directed
> either.
>
> I read about BGLs CSR graph
> http://www.boost.org/doc/libs/1_55_0/libs/graph/doc/compressed_sparse_row.html
> but since I honestly have no clue how to use BGL yet I thought I'd first hear
> your opinons.

I know you asked about the sequential BGL, but as a data point, I’ve run the Parallel BGL on graphs of this scale. ~100 (quite old) cluster nodes with 4G each or 16-32 16G nodes, connected via a low-latency interconnect (e.g., not Ethernet/TCP), were sufficient depending on the algorithm. Strong scaling improvements are certainly possible up to a point if you have more machines available. The large surface-to-volume ratio of grid graphs should minimize contention, though in the case of shortest paths the distribution of your edge weights matters quite a bit as well. CSR is essential for memory efficiency but is straightforward to substitute in place of an adjacency list.

I’m not sure what qualifies as “normal” hardware but if a modest cluster with Myrinet/Infiniband is reasonable than graphs of the scale you mention are certainly reasonable. I suppose it’s possible to run the PBGL in one of the public cloud environments, but the networking performance tends to be so abysmal with most of them that I would expect significantly increased runtimes.

Hope that data-point is helpful.

Regards,
Nick


Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net