Boost logo

Boost :

From: jsiek_at_[hidden]
Date: 1999-12-07 00:09:43


I'm afraid this is email is a bit long, so here's a TOC :)

I. Iterators & Graph Traversal
II. Data Accessors & GGCL Decorators
III. Algorithm Iterators & GGCL Visitors
IV. Stepanov, Connected Components, Edges and the Bracket Operator
V. Implementation Experiments

Preface
-------

One thing that should be carefully separated in this discussion of
graph algorithm interface are the things that are merely syntactic
differences, and the things which are fundamental differences. What
I'm going to argue is that many of the differences between Dietmar's
approach and GGCL is syntactic. However, this is not to belittle
syntax, it is important for lots of technical reasons, especially
interoperability with STL.

Also, I appreciate that both Dietmar and I have invested time in
different interfaces so it may not be easy for either of us to be
unbiased. Keeping this in mind, as I make arguments I'll try to be
very specific so that each point can be easily evaluated.

I. Iterators & Graph Traversal
------------------------------------

First to address the differences in the interface for traversing
through a graph with GGCL and with Dietmar's inteface... Dietmar,
would you mind sharing what your current thinking is for your iterator
interface? You gave a text description in your email, but not the
details of the interface. I'm going to try to give a comparison here
anyways, based on your previous publications (your thesis and some
notes from the Dagstuhl conferences). If I misrepresent your
interface I'm sorry!

What I think the main difference is that in Dietmar's approach, one
can obtain an adjacency iterator directly from another adjacency
iterator with something like this:

adj_iter2 = adj_iter1.cur_adj();

In GGCL the equivalent statement would be:

adj_iter2 = adj(*adj_iter1).begin();

where the dereference gives a vertex object, and then the adj()
function gives the out-edge "list".

This is a good time to address Dietmar's complaint about the Vertex
and Edge concepts of GGCL. Dietmar stated that graph algorithms don't
ever need to deal with actual vertices and edges. I agree with him!
Perhaps he is reading more into the Vertex and Edge concepts than what
I've defined. (granted, "Vertex" and "Edge" are heavily loaded terms)
The Vertex concept of GGCL consists of nothing but functionality for
accessing adjacency information, so they actually don't add any more
requirements or functionality than the cur_adj() function of his
interface. One may then ask, why bother introducing the Vertex
concept if one can make do by adding an extra method such as cur_adj()
to the iterator?

There are several reasons for this. The first is that I believe it is
better to keep concepts small and simple, with a single purpose.
(this design philosophy originated with procedural programming, but it
is even more important in object-oriented and generic programming).
Iterators in essense are generalized pointers, things with operator++
and operator*. I believe the conceptual integrity is weakened if you
add or remove operations (adding cur_adj() and removing operator*).

The change of the iterator interface also breaks interoperability with
STL algorithms and other generic algorithms. These algorithms all
assume an iterator has the operator*. With the GGCL interface,
iterators and just plain old iterators (usually forward iterators),
and the extra operations are split out into the Vertex concept.

The third reason is that the presense of vertex and edge objects (even
if they don't do very much) creates an interface that looks more
"natural". It's a closer fit to the problem domain. The vertex and
edge interface objects are really just placeholder for the
*underlying* concepts. The next paragraph will make this more clear...

One perhaps non-obvious fact about the vertex and edge interface
objects of GGCL is that they don't have to be persistent (on the
heap), and in fact most efficient graph representations don't store
"vertex" objects per se. In the GGCL implementation, the vertex and
edge objects are typically very light-weight/stack based objects that
are created "on-demand". They really only exist for purposes of the
interface. My nickname for this kind of object is a "Mayfly".

One possible argument against the vertex and edge objects would be
that perhaps they create extra run-time overhead. This may be true of
some compilers, but with a good modern optimizing compiler they
introduce literally zero overhead. We have several sparse matrix
ordering algorithms implemented using GGCL that match the original
Fortran codes! (using the KAI C++ compiler).

About the ContainerRef concept in GGCL... The purpose of this concept
is solely to define the kind of thing returned by the out_edges(v)
function (v is a vertex) and the adj(v) function. As I mentioned, it
is something that has a begin() and end(), which return iterators.
That's it. I'm afraid the use of the word "Container" in the name
brought along too much baggage. One of Dietmar's suggestions, about a
data-accessor that returns "both the start and end iterator", is
essentially equivalent. I am not adding any additional requirements.
The iterators of the ContainerRef are forward iterators, just as
Dietmar's are.

II. Data Accessors & GGCL Decorators
------------------------------------

Dietmar and I agree that data accessors (GGCL decorators) are hugely
important to graph algorithms. There really is only a syntactic
difference here, and perhaps also some implementation flexibility
issues. GGCL uses operator[]. Dietmar uses set() and get(). I agree
that set() and get() are more flexible implementation wise, and I'm
not entirely opposed to them. I would just be a bit sad to see the
"for free" interoperability with random-access iterators (and builtin
arrays) go away. (more about this in section IV.) Also, the bracket
is nicer from a "pretty code" point of view, but this of course is a
smaller concern. In addition, one can write read-only and write-only
versions of a bracket operator, so I'm not entirely convinced of its
inflexibility.

About the VertexWithID... I appreciate Dietmar's comment. We
actually started out requiring access to the index of a vertex through
a decorator (data accessor). With something like id[v]. Its
definitely more orthogonal that way. We moved away from it due to
some implementation issues, but it would not be hard to switch back...
in fact I think the ID decorator (data accessor) is still used in
places within GGCL.

III. Algorithm Iterators & GGCL Visitors
----------------------------------------

Next issue: algorithm iterators. I think they are a neat idea. Their
motivation is the same motivation as for the Visitors in GGCL. Since
algorithm iterators and Visitors are more about algorithm
organization, I think it would be safe to postpose discussion on this
issue until the core graph iterface is defined.

On a side note, I don't understand Dietmar's reaction against
"call-backs". The STL uses the modern form of call-backs (functors)
all over the place, and they seam to work very well.

IV. Stepanov, Connected Components, Edges and the Bracket Operator
------------------------------------------------------------------

As to Mr. Stepanov. Yes I've shared this interface with him. For
the first 20 minutes of my presentation he ripped it apart (he likes
to do that you see). However, then he realized that I was just using
different names, and slightly different syntax for the same stuff he
had been thinking of, he came around and started to like the
interface. In fact, one of the GGCL algorithms is written by Alex.
The connect_components_on_edgelist() algorithm is his creation. That
algorithm is of particular interest for several reasons. First, he
developed the algorithm long before I ever came up with the GGCL
interface, but luckily "porting" the algorithm to GGCL was completely
trivial. We replaced "first(e)" and "second(e)" with "source(e)" and
"target(e)" (where e is an edge). That's it! We're talking pages of
code, and that was the only change. (I also did a search and replace
converting the use of "Node" to "Vertex" in the template argument
names). This algorithm is also a good example of why the Edge concept
is important. The algorithm operates on an iterator-range of Edges.
Finally, another big reason the "port" was so easy was that his
algorithm already used the bracket operator as the method for
accessing vertex properties (the concrete implementation he had in
mind was using int's for vertices and to store the properties in
vectors).

V. Implementation Experiments
-----------------------------

Dietmar's idea of doing some implementation experiments is great. I'd
be happy to do up some examples in the GGCL interface and set up
timing experiments, etc.

Cheers,

Jeremy

----------------------------------------------------------------------
 Jeremy Siek
 Ph.D. Candidate email: jsiek_at_[hidden]
 Univ. of Notre Dame work phone: (650) 933-8724
 and cell phone: (415) 377-5814
 C++ Library & Compiler Group fax: (650) 932-0127
 SGI www: http://www.lsc.nd.edu/~jsiek/
----------------------------------------------------------------------


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk