Boost logo

Boost :

From: jsiek_at_[hidden]
Date: 1999-12-07 17:04:59


Hi Dietmar,

Your last email was encouraging to me. With your new interface we are
now much closer to having a common interface. At one point in the
design of GGCL I considered an approach similar the yours using Graph
Objects.

In this email I'm going to focus in on the differences between GGCL
and Dietmar's current interface with regards to only the graph
structure interface. I'll try to reply to the other points in
different emails (in different threads as Dietmar suggested).

BTW, the naming scheme for variables I use is this:
e - a GGCL Edge object
u,v - GGCL Vertex objects

I. The Two Interfaces
II. Implications
  a) implementation issues
  b) when a graph object isn't needed
  c) prettiness and encapsulation
III. Miscellaneous Replies

I. The Two Interfaces
------------------------------------

Dietmar's new interface has three main concepts: NodePointer
(NP), EdgePointer (EP), and GraphObject (GO). The concepts actually
map quite directly to the GGCL concepts Vertex (V) and Edge (E).

V = NP + GO
E = EP + GO

That is, the "Vertex" object in GGCL plays the role of the NodePointer
and partially of the GraphObject of Dietmar's approach. Similarly for
GGCL Edge. To show in a bit more detail how this mapping works,
here it is for most of the operations:

> - NP is a node pointer type and np is an object of this type
> - EP is an edge pointer type and ep is an object of this type
> - NIt, EIt, IIt are the types of node, edge, and incidence iterators
> respectively
> - nit, eit, iit are node, edge, and incidence iterator, respectively
>
> Supported operations of a "graph object" 'go' of type 'GO':

> - np = go.tail(ep) // obtain a pointer to the start of the edge

maps to source(e)

> - np = go.head(ep) // obtain a pointer to the destination of the edge

maps to target(e)

> - iit = go.begin(np) // begin iterator for "incidence list" of the node
> - iit = go.end(np) // past the end iterator for this list

maps to adj(v).begin() and adj(v).end()

> - ep = go.to_edge_pointer(eit) // obtain an ede pointer from an iterator
> - ep = go.to_edge_pointer(iit) // obtain an ede pointer from an iterator
> - np = go.to_node_pointer(nit) // obtain a pnode ointer from an iterator

maps to operator*() for the iterator of an edgelist_type and vertexlist_type
of the Vertex concept.

The next few routines have mappings to GGCL, but I did not include
a description of them in my proposal to BOOST. The "g" object
is a GGCL graph object.

> - nit = go.nodes_begin() // begin iterator for the "list" of all nodes
> - nit = go.nodes_end() // past the end iterator for this "list"

maps to vertices(g).begin() and vertices(g).end()

> - eit = go.edges_begin() // begin iterator for the "list" of all edges
> - eit = go.edges_end() // corresponding past the end iterator

maps to edges(g).begin() and edges(g).end()

II. Implications
----------------

Now what are the implications of the two interfaces in terms of
implementation issues and interfacing & generic programming issues.
I'm going to argue that Dietmar's interface makes for easier
implementation, while the GGCL interface is more flexible (more
abstract) and has a "cleaner" look to it.

a) Implementation Issues

In Dietmar's interface a graph object takes part in every operation.
This makes for a very convenient way to access information that is
associated with the graph as a whole. Suppose we are implementing a
graph data struct using vector< slist<int> >. Here's what a partial
implementation might look like with Dietmar's approach:

struct graph_object
{
  typedef slist<int>::iterator incidence_iterator;
  typedef int node_pointer;

  incidence_iterator begin(node_pointer np) {
    return this->vec[np].begin();
  }
  incidence_iterator end(node_pointer np) {
    return this->vec[np].end();
  }
  node_pointer to_node_pointer(incidence_iterator i) {
    return *i;
  }
  vector< slist<int> > vec;
};

Here's what it might look like with the GGCL interface:

typedef slist<int>::iterator ListIter;
typedef vector< slist<int> > Vec;

struct vertex
{
  vertex(int np_, Vec& vec_) : np(np_), vec(vec_) { }
  int np;
  Vec& vec;
};
struct vertexlist_type
{
  struct iterator
  {
    typedef vertex value_type;
    iterator(Iter i, Vec& v) : iter(i), vec(v) { }
    vertex operator*() { return vertex(*iter, vec); }
    iterator& operator++() { ++iter; return *this; }

    Vec& vec;
    ListIter iter;
  };
  vertexlist_type(iterator s, iterator f, Vec& vec_)
     : start(s), finish(f), vec(vec_) { }
  iterator begin() { return iterator(start,vec); }
  iterator end() { return iterator(finish,vec); }

  iterator start, finish;
  Vec& vec;
};

vertexlist_type adj(vertex v) {
  return vertexlist_type(v.vec[v.np].begin(),v.vec[v.np].begin(), v.vec);
}

So obviously there is more work to do when creating the GGCL
interface for this type of graph representation.

b) When a Graph Object Isn't Needed

One problem I have with the graph object approach is that there are
some kinds of graph representations for which there is no need for a
graph object. This is true of the kind of graph structure that Alex
Stepanov's connected components algorithm works on. Here's a little
example of setting up a graph structure, and then calling his
algorithm:

  typedef int Index; // ID of a Vertex
  typedef pair<Index,Index> Edge;
  const int N = 6;
  const int E = 4;
  Edge edgelist[] = { Edge(0, 1), Edge(1, 4), Edge(4, 0), Edge(2, 5) };
  
  std::vector<Index> c;
  connected_components_on_edgelist(c, edgelist, edgelist + E);

And here's the top-level function of Alex's algorithm:

template <class IndexContainer, class EdgeI_InputIterator>
void connected_components_on_edgelist(IndexContainer& c,
                                      EdgeI_InputIterator first,
                                      EdgeI_InputIterator last) {
  std::vector<unsigned char> rank;
  typedef typename IndexContainer::iterator IndexIter;
  typedef typename IndexContainer::value_type Integer;
  component_representative_with_path_halving<IndexIter,Integer> comp_rep;
  for (; first != last; ++first)
    connect_components(c, rank, source(*first), target(*first), comp_rep);
}

With Dietmar's interface, one would have to create a graph object
class that basically doesn't do anything, and then pass it around in
the algorithms.

c) Prettiness and Encapsulation

So far the main advantage of Dietmar's interface is that it makes the
graph structure easier to implement in some situations. It does this
by having a graph object participate in every operation. As such I
would argue that the presense of the graph object is really an
implementation detail for some graph types, and therefore does not
belong in the interface used by generic graph algorithms. I think it
is better to *encapsulate* the existence (or non-existence) of the
graph object underneath the Vertex and Edge concepts.

One nice side effect of having the Vertex and Edge concepts, and
removing the graph object, is that the algorithms look more like the
pseudo-code found in text books and papers, which means that it is a
closer match to the problem domain.

As Dietmar mentioned previously, the difficulty of implementation
should be of a lessor concern than the other interfacing issues.
Therefore I would encourage the move to the slightly more abstract
but less easy to implement interface of GGCL.

III. Miscellaneous Replies
--------------------------

> - Most interesting graph algorithms associate more than just one piece
> of data with an object like a node or an edge, eg. in flow algorithms
> there are upper and lower flow constraints as well as the current flow
> associated with each edge. Which of these shall 'operator*()' access?
> Basically, 'operator*()' addresses a similar problem that the decorators/
> data accessors are intended to solve!

In GGCL, the operator*() does not pick "just one peice of data" for the
return value. Instead it returns a Vertex, which is something like the
NodePointer of Dietmar's interface. The Vertex can then be used with a
data accessors to get different peices of data.

> - 'operator*()' invariably returns a reference to some data, at least under
> the current rules in the STL: The requirements on STL iterators
> basically prohibit the use of a proxy class. Thus, 'operator*()' definitely
> needs to return a reference to an object to be conforming to the STL
> requirements. This implies serious restrictions (see below: operator[]()
> vs. get()/set()).

I suppose one could consider the typical Vertex of GGCL a proxy class,
but I do not see the problem with that, nor how that violates STL
requirements. In fact, most standard implementations of
ostream_iterator use a proxy class (the SGI version uses the
ostream_iterator itself as a proxy).

Cheers,

Jeremy

----------------------------------------------------------------------
 Jeremy Siek
 Ph.D. Candidate email: jsiek_at_[hidden]
 Univ. of Notre Dame work phone: (650) 933-8724
 and cell phone: (415) 377-5814
 C++ Library & Compiler Group fax: (650) 932-0127
 SGI www: http://www.lsc.nd.edu/~jsiek/
----------------------------------------------------------------------


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk