|
Boost : |
From: Aaron Windsor (aaron.windsor_at_[hidden])
Date: 2007-06-16 11:24:07
On 6/15/07, Andrew Sutton <asutton_at_[hidden]> wrote:
> All Interested,
>
> For my Summer of Code proposal, I have been working on the usability
> of the Boost.Graph library, basically adding a couple wrappers to
> make it easier to work with undirected and directed graphs. One
> aspect of these classes is that they do not use the typical vector
> storage for vertices, edges, or the out edge list. In this case, list
> storage provides more iterator and descriptor stability so we (mentor
> Jeremy Siek and myself) decided go that direction.
>
> This decision has some interesting fallout since almost every example
> for Boost.Graph, including that in the documentation is written using
> vecS for most of the storage facilities. This lead some interesting
> discoveries like, for example, that num_vertices(g) is O(n) with
> listS vertex storage under GCC - who knew?. also that most graphs
> declared using `listS` will fail to run correctly with default
> parameters on just about every algorithm. And finally, that most of
> the examples given in the documentation and libs directory will also
> fail if you stick this line of code just before a call to any core
> algorithm:
>
> remove_vertex(*vertices(g).first, g);
>
> Except for the linear time num_vertices(g) call, these problems are
> caused by indiscriminate use of vertex descriptors as indices - in
> fact, that's exactly the reason that the above line of code will
> cause most examples to fail.
>
> I've written down a solution to this and added it to the
> documentation I've been working for my project. It essentially
> involves three functions that can be used to help manage vertex
> indices. Essentially they are:
>
> get_vertex_index(v, g)
> max_vertex_index(g)
> renumber_vertex_indidces(g)
>
> Correct use of these functions will solve a number of problems. The
> more complete proposal is in subversion at:
>
> http://svn.boost.org/trac/boost/browser/sandbox/SOC/2007/graphs/libs/
> graph/doc/quickbook/indexing.qbk
>
> Part the related implementation is here:
>
> http://svn.boost.org/trac/boost/browser/sandbox/SOC/2007/graphs/boost/
> graph/undirected_graph.hpp
>
> The three functions in question are the last in the file, although
> they do call some member functions of the graph class itself.
>
> I'd be grateful to some suggestions for improvement or other comments.
>
> Andrew Sutton
> asutton_at_[hidden]
Hi Andrew,
Glad to see that you are making progress on this problem - I think
improving the usability of the BGL for the casual user is a very
worthwhile project.
Your implementation makes it easier for users to manage a vertex/edge
index, but I'd like to suggest a scheme where the user could be
blissfully unaware of the existence of a vertex/edge index entirely
unless they explicitly needed it. To me, this is the big usability
problem with the BGL - the fact that a user needs to understand the
concept of a vertex/edge index map before even simply using a basic
algorithm like breadth-first search. But almost all BGL algorithms
make use of at least a vertex index, and a large percentage of the
BGL-related questions on the boost-users list deal with
misunderstanding/misuse of vertex and edge index maps. I think it
would be nice if the undirected_graph and directed_graph class came
equipped with a ready-to-use vertex_index and edge_index, so that when
a user with no prior knowledge of the BGL codes up an example like the
one in your proposal:
vertex_descriptor u = add_vertex(g);
vertex_descriptor v = add_vertex(g);
remove_vertex(u, g);
breadth_first_search(g, v, make_bfs_visitor(null_visitor()));
everything just works. It's possible to do this lazily with very
little overhead using the following scheme (I'll describe the scheme
for a vertex index, but the same scheme works for an edge index as
well): give the graph class two additional members:
global_vertex_timestamp and next_vertex_index, and give each vertex
two interior properties: raw_vertex_index and local_vertex_timestamp.
All of these variables are of type
graph_traits<graph_type>::vertices_size_type. Upon construction of the
graph, global_vertex_timestamp is initialized to 1, next_vertex_index
is initialized to 0, and each vertex's local_vertex_timestamp is
initialized to 0.
The property map p returned by get(vertex_index, g) is a read-only
property map that does the following: when get(p, v) is called for
some vertex v, first check to see if v's local_vertex_timestamp is
equal to the graph's global_vertex_timestamp. If so, return
raw_vertex_index. If not, set local_vertex_timestamp =
global_vertex_timestamp, set raw_vertex_index = next_available_index,
and increment next_available_index. This scheme indexes the vertices
"on-demand", and allows you to lazily reset *all* vertex indices at
once in constant time (for example, whenever remove_vertex is called)
just by incrementing global_vertex_timestamp and setting
next_vertex_index to 0.
One wrinkle in the above scheme is that you'd need to do something
special in the rare case that the global_vertex_timestamp overflowed
back to 0. In this case, you could just check after each increment of
global_vertex_timestamp to see if it's equal to the max() value
defined in std::numeric_limits, and if so, do a full reset
(global_vertex_timestamp = 1, next_vertex_index = 0, each
local_vertex_timestamp = 0).
It would also have to be documented that the indices are invalidated
when a vertex is removed, since the vertex index can be used to create
exterior property maps (for example, a vector indexed by the vertex
index). This could be a very confusing error - all of the sudden, the
mapping from vertex -> external property would be permuted (since the
vertices are re-indexed) if a vertex is removed.
Just a different philosophy - whether you want to make vertex/edge
indices easier to use, or whether you want to try to sweep them under
the rug entirely for casual users of the BGL. If a user is going to do
anything advanced with the BGL, they're going to have to understand
vertex/edge indices eventually, but anything we can do to make
everything compile and run for the first time/casual user is an
improvement!
Regards,
Aaron
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk