[Graph] CSR from list of tuples

Dear all, I need to construct (and use) a graph that can handle millions of nodes, with minimal overhead, and as suggested I am using a CSR graph. As I understand from the documentation, it is unmodifiable. Arcs are created at runtime in a parallel fashion, and my choice was to use a vector of std::tuple<std::size_t, std::size_t, int> (that is, node, node, weight) with Intel's TBB. This is the problem: can I avoid temporary vectors to construct my graph? Right now the only way I see is to create the vector of arcs (as std::pair) and the vector of weights, but this is a waste of time and memory. Do you have any helpful suggestions? Cheers & Thanks!

On Tue, May 20, 2014 at 8:52 AM, Sensei <senseiwa@gmail.com> wrote:
Dear all,
I need to construct (and use) a graph that can handle millions of nodes, with minimal overhead, and as suggested I am using a CSR graph. As I understand from the documentation, it is unmodifiable.
Arcs are created at runtime in a parallel fashion, and my choice was to use a vector of std::tuple<std::size_t, std::size_t, int> (that is, node, node, weight) with Intel's TBB.
This is the problem: can I avoid temporary vectors to construct my graph?
Right now the only way I see is to create the vector of arcs (as std::pair) and the vector of weights, but this is a waste of time and memory.
Do you have any helpful suggestions?
Cheers & Thanks!
(apologies for the double-post, had to update my mailing list subscription) There is an in-place constructor to the CSR graph, but you'll need three separate vectors (sources, targets, properties) rather than a vector of tuples. See the in-place ctors here<http://www.boost.org/doc/libs/1_55_0/libs/graph/doc/compressed_sparse_row.html> . HTH, Nick

On 20/05/14 23:12, Nick Edmonds wrote:
There is an in-place constructor to the CSR graph, but you'll need three separate vectors (sources, targets, properties) rather than a vector of tuples. See the in-place ctors here <http://www.boost.org/doc/libs/1_55_0/libs/graph/doc/compressed_sparse_row.html>.
Thanks Nick, I know about the in-place constructor, and that is what I used before I needed a weighted graph. But now I need to have weights on arcs, and since the arcs are chosen among N^2 node couples (and there is no way I can avoid checking N^2 node pairs), I opted for Intel's TBB. It is convenient to use a TBB concurrent vector of tuples since push_back is guaranteed to work, and that is just what I need. However, if I switch to a concurrent vector for arcs and another one for weights, I might need a semaphore, hurting performances. Moreover, I've been thinking about using a concurrent vector and a hash map from pairs to weights, but I don't know if this can be useful in a boost CSR constructor. I really hope I could get rid of these temporaries! Thanks!

On Wed, May 21, 2014 at 4:39 AM, Sensei <senseiwa@gmail.com> wrote:
On 20/05/14 23:12, Nick Edmonds wrote:
There is an in-place constructor to the CSR graph, but you'll need three
separate vectors (sources, targets, properties) rather than a vector of tuples. See the in-place ctors here <http://www.boost.org/doc/libs/1_55_0/libs/graph/doc/ compressed_sparse_row.html>.
Thanks Nick, I know about the in-place constructor, and that is what I used before I needed a weighted graph.
But now I need to have weights on arcs, and since the arcs are chosen among N^2 node couples (and there is no way I can avoid checking N^2 node pairs), I opted for Intel's TBB.
It is convenient to use a TBB concurrent vector of tuples since push_back is guaranteed to work, and that is just what I need. However, if I switch to a concurrent vector for arcs and another one for weights, I might need a semaphore, hurting performances.
I think you could avoid a semaphore around the insertions for sure. Do you know the upper bound on the number of edges a priori? If so then you can guarantee that the vector doesn't realloc and just use an int-fetch-add (either an intrinsic or Boost.Atomic) to reserve indices in your vectors.
Moreover, I've been thinking about using a concurrent vector and a hash map from pairs to weights, but I don't know if this can be useful in a boost CSR constructor.
Eventually you're going to have to build a dense vector of edge weights if you want to use a CSR graph, either inside the CSR ctor or before you call it. The only way to avoid this that I can think of is to build the edge weight vector directly in your graph generation code. Whether that's more costly than using a temporary later depends on whether you have more time or memory to spare. I really hope I could get rid of these temporaries!
Thanks! _______________________________________________ Boost-users mailing list Boost-users@lists.boost.org http://lists.boost.org/mailman/listinfo.cgi/boost-users

On 21/05/14 19:36, Nick Edmonds wrote:
I think you could avoid a semaphore around the insertions for sure. Do you know the upper bound on the number of edges a priori? If so then you can guarantee that the vector doesn't realloc and just use an int-fetch-add (either an intrinsic or Boost.Atomic) to reserve indices in your vectors.
That's a good idea, thank you, Nick! In this particular case, yes, I know that the number of edges is limited to 32, so I think I will use your suggestion.
Eventually you're going to have to build a dense vector of edge weights if you want to use a CSR graph, either inside the CSR ctor or before you call it. The only way to avoid this that I can think of is to build the edge weight vector directly in your graph generation code. Whether that's more costly than using a temporary later depends on whether you have more time or memory to spare.
Mmhh, I will benchmark this option. Weights are generated by a nice O(N^3) algorithm so I guess it's quite expensive either way. Thanks!
participants (2)
-
Nick Edmonds
-
Sensei