Subject: Re: [Boost-users] [BGL] Upper limits on graph size
From: Jeremiah Willcock (jewillco_at_[hidden])
Date: 2010-05-20 11:06:17
On Thu, 20 May 2010, Adam Spargo wrote:
> Hi, I am working on genome assembly software and hope that the BGL can
> save me a lot of development time, but before I make the investment in
> learning the library, can somebody advise me on whether it is
> appropriate?
>
> My initial test sets will be quite small. However, in the end I will
> want to scale up to on the order of a billion nodes, quite sparsely
> connected. We have the RAM and many CPUs, but will the code scale up
> this far?
For this level of scalability, we have the Parallel BGL (mostly in
boost/graph/distributed and libs/graph_parallel; more info at
<URL:http://www.osl.iu.edu/research/pbgl/>), which runs on
distributed-memory systems using MPI. We have successfully run tests up
to roughly two billion vertices (16G undirected edges) on 96 machines
(4GiB of memory each). How much RAM and how many CPUs do you have? PBGL
works on clusters or SMP systems, but remember that RAM, not CPU speed,
is the usual limit on how many vertices fit on a single machine; as a
rough estimate, a billion-vertex graph with a few edges per vertex
already needs tens of gigabytes for the structure alone, before any
attached properties. How many edges do you have? Directed or
undirected? How much data do you need to attach to each vertex or edge?
What kinds of algorithms do you want to run?
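
To give a feel for the API, here is a minimal sketch of a distributed
graph, untested here and with placeholder sizes; exact headers and
build flags depend on your Boost version, and you would typically
compile with mpic++ and link against boost_mpi, boost_serialization,
and boost_graph_parallel:

// Minimal Parallel BGL sketch: an undirected graph whose vertices
// are block-distributed across the MPI processes.
// (Sizes below are placeholders, not a recommendation.)
#include <boost/graph/use_mpi.hpp>   // must come before other PBGL headers
#include <boost/mpi/environment.hpp>
#include <boost/graph/distributed/mpi_process_group.hpp>
#include <boost/graph/distributed/adjacency_list.hpp>
#include <iostream>

using boost::graph::distributed::mpi_process_group;

typedef boost::adjacency_list<
    boost::vecS,
    boost::distributedS<mpi_process_group, boost::vecS>,
    boost::undirectedS>
  Graph;

int main(int argc, char* argv[]) {
  boost::mpi::environment env(argc, argv);  // initialize MPI

  const std::size_t n = 1000000;  // global vertex count (placeholder)
  Graph g(n);                     // vertices split across processes

  // add_edge takes global descriptors; vertex(i, g) maps a global
  // index to whichever process owns it. Edges touching remote
  // vertices are buffered and delivered at the next synchronize().
  if (process_id(g.process_group()) == 0) {
    add_edge(vertex(0, g), vertex(1, g), g);
    add_edge(vertex(1, g), vertex(n - 1, g), g);
  }
  synchronize(g.process_group());  // collective; flushes remote adds

  // num_vertices(g) reports only this process's local share.
  std::cout << "process " << process_id(g.process_group())
            << " owns " << num_vertices(g) << " vertices\n";
  return 0;
}

The point of the block distribution is that each process only has to
hold its own share of the vertex and edge data in RAM, which is what
lets the aggregate graph grow past what any single machine can store.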
-- Jeremiah Willcock