Boost logo

Boost :

From: Chuck Messenger (chuckm_at_[hidden])
Date: 2003-05-29 13:19:07


Peter Dimov wrote:
...
> This is a graph. The classic "tree structure" combined with reference
> counting can only represent (possibly multiply linked) DAGs, but not general
> graphs.
>
> One possible representation of a general graph is a vertex pool and a
> separate edge pool. This avoids the ownership issues inherent in the tree
> representation since now nodes don't own each other, the two pools are the
> owners.

Right -- which boils down to implementing your own framework-specific
garbage collector. I'm not sure why you need to split it into separate
vertex and edge pools, though. In my example:

     {
         Node n1(1);

         {
             Node n2(2), n3(3), n4(4);

             n1.Connect(n2);
             n2.Connect(n2);
             n3.Connect(n3);
             n4.Connect(n4);
         }

         cout << "n1 points to the cluster {n2, n3, n4}\n";
     }

     cout << "n2, n3, n4 remain alive. That sucks.\n";

what would happen is that the NodeImpl's would be allocated out of a
NodeImpl pool. Whenever a NodeImpl is destructed, garbage may have been
created. The only way to find the garbage is to traverse the entire
graph. As in sp_collector, we need to compute, for each Node, the total
"internal" reference count. When a Node's internal count equals its
total count, we collect it. The question is: how do you traverse the
internal graph?

That's where sp_collector.cpp's trick comes in handy: given an arbitrary
object (i.e. casted to char*), you can detect any references in it.

You might say: why not simply provide a "traverse()" function for each
structure holding a Node? That's possible, but very ugly. Suppose I
add a new Node to the structure -- I need to remember to add it to my
traverse() function. That's exactly the sort of thing I want to avoid.
  It's the exact same problem created by cyclic_ptr's "*p = *p" trick
for discovering references, in cases where *p contains one or more
non-copyable members (and therefore, you need to hand-craft a copy
operation).

However, your vertex/edge notion has got me thinking. The invariant
would work this way: a VertexImpl (e.g. NodeImpl) may contain any number
of Vertex's (e.g. Node's) directly. However, a Vertex must *only* exist
either directly in a VertexImpl, or in an Edge (in my Network Simulation
Library, I'd call these Connection's). Edge's may exist anywhere,
including STL containers. In my example, I'd have:

     struct Connection {
         Connection(const Node& from, const Node& to)
                                 : from_(from), to_(to) { }
         Node from_;
         Node to_;
     };

     struct NodeImpl : noncopyable {
         NodeImpl(int id) : id_(id) { cout << id_ << " lives\n"; }
         ~NodeImpl() { cout << id_ << " dies\n"; }
     private:
         int id_;
         set<Connection> connections_;
         friend struct Node;
     };

     void Node::Connect(const Node& n) {
         pimpl_->connections_.insert(Connection(*this, n));
     }

Each Edge is strictly owned by one VertexImpl, in the traditional
reference-counting manner. When a VertexImpl dies, all its Edge's die
-- by traditional strict ownership semantics (so, they can be in STL
containers or just about anywhere else). The beauty is that these
Edge's advertise their death. Hence, the dependency graph can be
maintained, so that garbage collection is possible.

The trick is to remember about putting Connection's in STL containers in
place of Node's. It's messy, but at least it's something. It helps
clarify my thinking about the issues.

Very nice. Thanks.

And yes, I have heard of, and experimented with, the BGL. But it's way
too heavy-artillery for my needs here...

     - Chuck Messenger


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk