Boost logo

Boost Users :

From: Douglas Gregor (doug.gregor_at_[hidden])
Date: 2007-09-12 15:21:47


On Sep 12, 2007, at 10:19 AM, Johan Oudinet wrote:
> pper_at_[hidden]> wrote:
>> Here are my limitations :
>>
>> - Very large graph ( > 40 million vertices)
>> - About 4 to 8 edge per vertex (Sparse)

That should be feasible with compressed_sparse_row_graph.

   40M vertices * 8 bytes per vertex = 320MB for vertex storage
   40M vertices * 8 edges per vertex * 4 bytes per edge = 1.28GB

>> I want to associate to each edge a property P such that :
>> edge (A,B) returns P
>> edge (B,A) returns -P (Directed)
>> BUT I do not want to duplicate the storage of P.
>>
>> If edge (A,B) exists then (B,A) also exists. (Bidirectional)

Well, you can have both (A, B) and (B, A) hold pointers to a P object
on the heap, along with a flag that states which edge is the forward
edge.

If the compressed sparse row graph supported undirectedS or
bidirectionalS, one of those would work well for you.

>> I also want to be able to access all edges connected to vertex A as
>> quickly as possible (so I guess using in_edges and out_edges in a
>> directed adjacency list will be too slow).

Why will in_edges and out_edges be too slow? You're just iterating
over a vector for each, or moving through indices in an array.

>> I think the compressed_sparse_row_graph might be the way to go by
>> storing all duplicate edges (the only implementation for the
>> moment is
>> directedS), but I need a way to not duplicate the properties
>> associated
>> with each edge.
>>
>> All advice will be greatly appreciated !!
>
> I agree with you, the only graph type you can use in BGL with very
> large graphs is the compressed_sparse_row_graph.

Right. adjacency_list has a bit too much overhead.

> You may use the singleton pattern for properties associated with each
> edge, so you can avoid creating multiple instances of property object.
>
> Otherwise, I use the BCG graph
> (http://www.inrialpes.fr/vasy/cadp/man/bcg.html) to handle very large
> graphs, but this graph format is not documented yet (a technical
> report is on-going), so I have to use the BCG environment providing by
> the CADP toolbox.

An undocumented format... That's unfortunate, since it precludes any
interoperability with the rest of the world.

There is a highly-compressed graph representation that is compatible
with the BGL, here:

        http://homer.informatics.indiana.edu/~nan/webgraph/

It's targeted a web graphs, but it may be useful for this.

        - Doug


Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net