Boost logo

Boost Users :

Subject: Re: [Boost-users] [BGL] Upper limits on graph size
From: Bo Jensen (jensen.bo_at_[hidden])
Date: 2010-05-20 15:31:37


On Thu, May 20, 2010 at 5:50 PM, Jeremiah Willcock <jewillco_at_[hidden]> wrote:
> On Thu, 20 May 2010, Adam Spargo wrote:
>
>> Hi, thanks for your reply, the fact that you have run something with two
>> billion vertices tells me that it's worth looking into. I will probably come
>> back to when I have done some runs. At the moment I'm mostly writing the
>> code that gets the information to build my graph.
>
> I would recommend using compressed_sparse_row_graph if you are working with
> directed (or bidirectional) graphs; the graph structure is read-only but it
> is memory-efficient and designed to scale to large graphs.  There are
> several constructors that are also designed to minimize memory usage during
> graph construction.
>
>> I have 500GB RAM on a single node for prototyping, plus a cluster of 4000
>> nodes with 4GB to 64GB RAM each so it should be cool.
>
> You should be able to use sequential BGL for your testing at first, then;
> you might even be able to use your 500GB machine for production runs.  We
> have not tried to scale Parallel BGL up to the 4000-node level yet, but are
> working on performance improvements to get it to scale that far.  We can
> give you prerelease versions you need scalability beyond what you are able
> to get with the released code.  However, with your graph sizes, you might
> only need 8-16 machines with 64G each so you may not need the scalability
> improvements.  It also depends on the complexity (as in big O) of your
> algorithms, though.  We are also working on taking advantage of
> shared-memory systems; specific support for them is not in the released code
> yet (other than that you can run multiple MPI processes on the same
> machine).
>
>> I'm still very much in the proof of concept stage. I'm one of the first to
>> explicitly construct the overlap graph in a genome assembly program, most
>> have something like it but can't really play with it before they barf the
>> answer. So hopefully the BGL will give be the power to explore my graph, but
>> like I said - early days.
>
> OK.  Please keep in touch and report any problems (or successes) you have.
> We always appreciate users who are using BGL and PBGL at large scale.
>
> -- Jeremiah Willcock
> _______________________________________________
> Boost-users mailing list
> Boost-users_at_[hidden]
> http://lists.boost.org/mailman/listinfo.cgi/boost-users
>

Jeremiah,
Can you say a bit more about the work you mention on BGL for
shared-memory ? Has some implementation been done ? I ask because I am
considering to use BGL for parallel computing on shared-memory
machines.
Thanks
Bo


Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net