Boost logo

Boost Users :

Subject: Re: [Boost-users] [BGL] Upper limits on graph size
From: Jeremiah Willcock (jewillco_at_[hidden])
Date: 2010-05-20 11:50:01


On Thu, 20 May 2010, Adam Spargo wrote:

> Hi, thanks for your reply, the fact that you have run something with two
> billion vertices tells me that it's worth looking into. I will probably come
> back to when I have done some runs. At the moment I'm mostly writing the code
> that gets the information to build my graph.

I would recommend using compressed_sparse_row_graph if you are working
with directed (or bidirectional) graphs; the graph structure is read-only
but it is memory-efficient and designed to scale to large graphs. There
are several constructors that are also designed to minimize memory usage
during graph construction.

> I have 500GB RAM on a single node for prototyping, plus a cluster of 4000
> nodes with 4GB to 64GB RAM each so it should be cool.

You should be able to use sequential BGL for your testing at first, then;
you might even be able to use your 500GB machine for production runs. We
have not tried to scale Parallel BGL up to the 4000-node level yet, but
are working on performance improvements to get it to scale that far. We
can give you prerelease versions you need scalability beyond what you are
able to get with the released code. However, with your graph sizes, you
might only need 8-16 machines with 64G each so you may not need the
scalability improvements. It also depends on the complexity (as in big O)
of your algorithms, though. We are also working on taking advantage of
shared-memory systems; specific support for them is not in the released
code yet (other than that you can run multiple MPI processes on the same
machine).

> I'm still very much in the proof of concept stage. I'm one of the first to
> explicitly construct the overlap graph in a genome assembly program, most
> have something like it but can't really play with it before they barf the
> answer. So hopefully the BGL will give be the power to explore my graph, but
> like I said - early days.

OK. Please keep in touch and report any problems (or successes) you have.
We always appreciate users who are using BGL and PBGL at large scale.

-- Jeremiah Willcock


Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net