Boost logo

Boost :

Subject: Re: [boost] [graph_parallel] Is a distributed directed edge list possible?
From: caustik (caustik_at_[hidden])
Date: 2011-01-05 22:58:01


On Wed, Jan 5, 2011 at 12:44 PM, Aydin Buluc <abuluc_at_[hidden]> wrote:

> Dave Abrahams <dave <at> boostpro.com> writes:
>
> >
> > At Wed, 5 Jan 2011 13:35:59 -0500 (EST),
> > Jeremiah Willcock wrote:
> > >
> > > On Mon, 3 Jan 2011, caustik wrote:
> > >
> > > > Is it currently possible to represent a highly connected graph where
> any
> > > > given vertex may contain more directed edges than can be held in a
> single
> > > > process' OutEdgeListS?
> > > >
> > > > I see that the distributed version of adjacency_list allows vertices
> to be
> > > > distributed across multiple processes in a process group, but I don't
> see
> > > > the ability to allow directed edges to be distributed across multiple
> > > > processes. Is this an intentional limitation?
> > >
> > > No, it is not possible. PBGL currently distributes graph vertices (a
> > > so-called 1-D distribution) and keeps each edge with its source
> > > vertex. To do what you are describing, we would need to implement 2-D
> > > (edge-based) distributions, which do not fit into the current PBGL
> > > design.
> >
> > Another thought I had, if graph connectivity is really super-high, is
> > that you might be able to use the inverse graph representation.
> >
>
>
> Caustik, what kind of kernels and/or algorithms
> you want to run on the distributed graph?
>
> Combinatorial BLAS has evolved quite a bit (with many more primitives,
> more flexible vector distributions, a few more example algorithms, and
> numerous bug fixes) from the 1.0 alpha version that is on the web.
>
> Until the next release (likely to happen in a month or two), I can share
> the active branch with you if your application seems to fit well. If it
> doesn't, it'll still be very useful for us to see what kind of primitives
> we do not support so that we can include them in the future.
>
> Cheers,
> - Aydin
>
>
I'm not familiar with an 'inverse graph' -- I take it that is where each
vertex holds a list of the edges which do not exist, and assumes all others
to exist? Unfortunately that would still be a problem for these graphs,
because not every vertex will be highly connected (there are some with no
connections, many with a few connections, any a few with many connections).

The graph I'm trying to build is one where there are leaf vertices which
hold data, and other vertices which hold subgraphs of the overall graph. So
a vertex may be a leaf, a subgraph containing leafs, a subgraph containing
other subgraphs, or a subgraph containing both leafs and other subgraphs.
The algorithms that it needs to run take any of those subgraphs, enumerate
all of it's vertices recursively (a vertex holding a subgraph can contain
other vertices which hold other subgraphs, etc), and run various operations
on the data.

Combinatorial BLAS sounds fascinating, I'll have to take a deeper look. Not
quite sure yet if even a 2-D distribution will be the right data model. I'm
beginning to lean more toward a map/reduce mechanism where each process
holds only local data which can then be modified and eventually merged in
the map/reduce functions.

caustik


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk