From: Tiago de Paula Peixoto (tiago_at_[hidden])
Date: 2007-04-12 08:29:56


I believe the modifications done in

break xml namespace processing, since it will strip any namespace
information from all tags (everything before "|"), not only those which
belong to the graphml namespace. So, tags belonging to other namespaces,
such as "foo:node", will be wrongly parsed as graphml tags, e.g. "node"...
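
A minimal sketch of the namespace-aware behavior described above, assuming the parser reports element names as "namespace-uri|local-name" with "|" as the separator. The helper name ``local_name_in_namespace`` and the choice to reject names that carry no namespace at all are illustrative assumptions, not the code in the attached patch:

```cpp
#include <string>

// Hypothetical helper (not part of the patch): given a name of the form
// "<namespace-uri>|<local-name>", return the local name only when the URI
// part equals expected_ns. Names from other namespaces, and names with no
// separator at all, yield an empty string so the caller can ignore them.
std::string local_name_in_namespace(const std::string& qualified,
                                    const std::string& expected_ns,
                                    char sep = '|')
{
    // The local name cannot contain the separator, so search from the end.
    const std::string::size_type pos = qualified.rfind(sep);
    if (pos == std::string::npos)
        return std::string();                 // no namespace information
    if (qualified.compare(0, pos, expected_ns) != 0)
        return std::string();                 // belongs to another namespace
    return qualified.substr(pos + 1);         // strip only the matching prefix
}
```

With this check, a "node" element from a foreign namespace yields an empty string and is never mistaken for a graphml "node"; only a name whose URI part matches the expected namespace is reduced to its local name.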

I'm sending attached patches which revert to the old behavior and have
two further modifications:

- Better expat error handling. It now informs the user where in the file
a problem occurred
- Reordering of tag processing for possible speed improvement ("node" is
more frequent than "graph", so it should be tested first, etc)

I'm also sending slightly improved documentation and test case.

Besides this I have a couple of questions:

1- How should graph properties be handled? I did include support for it,
using dynamic_properties map with a key type equal to the graph type,
but I don't know if that's the proper way to do it.

2- Some of the exceptions are shared with the graphviz reader, but those
exceptions have the name "graphviz" in their error strings, and thus
create confusing messages when used with the graphml reader. Perhaps
those strings should be edited in the graphviz code?


Tiago de Paula Peixoto <tiago_at_[hidden]>

  void read_graphml(std::istream& in, MutableGraph& graph,
                    dynamic_properties& dp);

The ``read_graphml`` function interprets a graph described using the
graphml_ format and builds a BGL graph that captures that
description. Using this function, you can initialize a graph using
data stored as text.

The graphml format can specify both directed and undirected graphs, and
``read_graphml`` differentiates between the two. One must pass
``read_graphml`` an undirected graph when reading an undirected graph;
the same is true for directed graphs. Furthermore, ``read_graphml``
will throw an exception if it encounters parallel edges and cannot add
them to the graph.

To handle attributes expressed in the graphml format, ``read_graphml``
takes a dynamic_properties_ object and operates on its collection of
property maps. The reader passes all the properties encountered to
this object, using the graphml attribute names as the property names,
and with the appropriate C++ value type based on the graphml attribute type
definition. Graph properties are also set with the same
dynamic_properties object, where the key type is the type of the graph itself.

 - The type of the graph must model the `Mutable Graph`_ concept.
 - The type of the iterator must model the `Multi-Pass Iterator`_ concept.
 - The property map value types must be default-constructible.

Where Defined



  struct graph_exception : public std::exception {
    virtual ~graph_exception() throw();
    virtual const char* what() const throw() = 0;
  };

  struct bad_parallel_edge : public graph_exception {
    std::string from;
    std::string to;

    bad_parallel_edge(const std::string&, const std::string&);
    virtual ~bad_parallel_edge() throw();
    const char* what() const throw();
  };

  struct directed_graph_error : public graph_exception {
    virtual ~directed_graph_error() throw();
    virtual const char* what() const throw();
  };

  struct undirected_graph_error : public graph_exception {
    virtual ~undirected_graph_error() throw();
    virtual const char* what() const throw();
  };

  struct parse_error : public graph_exception {
    parse_error(const std::string&);
    virtual ~parse_error() throw() {}
    virtual const char* what() const throw();
    std::string statement;
    std::string error;
  };

Under certain circumstances, ``read_graphml`` will throw one of the
above exceptions. The four concrete exceptions can all be caught
using the general ``graph_exception`` type when greater precision
is not needed. In addition, all of the above exceptions derive from
the standard ``std::exception`` for even more generalized error
handling.
The ``bad_parallel_edge`` exception is thrown when an attempt to add a
parallel edge to the supplied MutableGraph fails. The graphml format
supports parallel edges, but some BGL-compatible graph types do not.
One example of such a graph is ``boost::adjacency_list<setS,vecS>``,
which allows at most one edge between any two vertices.

The ``directed_graph_error`` exception occurs when an undirected graph
type is passed to ``read_graphml``, but the graph defined in the graphml
file contains at least one directed edge.

The ``undirected_graph_error`` exception occurs when a directed graph
type is passed to ``read_graphml``, but the graph defined in the graphml
file contains at least one undirected edge.

The ``parse_error`` exception occurs when a syntax error is
encountered in the graphml file. The error string will contain the
line and column where the error was encountered.

Building the graphml reader
To use the graphml reader, you will need to build and link against
the "bgl-graphml" library. The library can be built by following the
`Boost Jam Build Instructions`_ for the subdirectory ``libs/graph/build``.


 - On successful reading of a graph, every vertex and edge will have
   an associated value for every respective vertex and edge property
   encountered while interpreting the graph. These values will be set
   using the ``dynamic_properties`` object. Some properties may be
   ``put`` multiple times during the course of reading in order to
   ensure the graphml semantics. Those edges and vertices that are
   not explicitly given a value for a property (and that property has
   no default) will be given the default constructed value of the
   value type. **Be sure that property map value types are default
   constructible.**
 - Nested graphs are supported as long as they are exactly of the same
   type as the root graph, i.e., are also directed or undirected. Note
   that since nested graphs are not directly supported by BGL, they
   are in fact completely ignored when building the graph, and the
   internal vertices or edges are interpreted as belonging to the root
   graph.
 - Hyperedges and Ports are not supported.

  template<typename Graph>
  void
  write_graphml(std::ostream& out, const Graph& g, const dynamic_properties& dp,
                bool ordered_vertices=false);

  template<typename Graph, typename VertexIndexMap>
  void
  write_graphml(std::ostream& out, const Graph& g, VertexIndexMap vertex_index,
                const dynamic_properties& dp, bool ordered_vertices=false);

This function writes a BGL graph object to an output stream in the
graphml_ format. Both overloads of ``write_graphml`` will emit all of
the properties stored in the dynamic_properties_ object, thereby
retaining the properties that have been read in through the dual
function read_graphml_. The second overload must be used when the
graph doesn't have an internal vertex index map; the map must then be
supplied through the ``vertex_index`` parameter.

OUT: ``std::ostream& out``
   A standard ``std::ostream`` object.

IN: ``VertexListGraph& g``
  A directed or undirected graph. The
  graph's type must be a model of VertexListGraph_. If the graph
  doesn't have an internal ``vertex_index`` property map, one
  must be supplied with the vertex_index parameter.

IN: ``VertexIndexMap vertex_index``
  A vertex property map containing the indexes in the range
  [0,num_vertices(g)).

IN: ``dynamic_properties& dp``
  Contains all of the vertex, edge and graph properties that should be
  emitted by the graphml writer.

IN: ``bool ordered_vertices``
  This tells whether or not the order of the vertices from vertices(g)
  matches the order of the indexes. If ``true``, the ``parse.nodeids``
  graph attribute will be set to ``canonical``. Otherwise it will be
  set to ``free``.


This example demonstrates using the BGL-graphml interface to write
a BGL graph to a file in graphml format.


  #include <boost/graph/graphml.hpp>
  #include <boost/graph/adjacency_list.hpp>
  #include <boost/tuple/tuple.hpp>
  #include <iostream>
  #include <string>
  #include <utility>

  using namespace boost;
  using std::pair;
  using std::string;

  enum files_e { dax_h, yow_h, boz_h, zow_h, foo_cpp,
                 foo_o, bar_cpp, bar_o, libfoobar_a,
                 zig_cpp, zig_o, zag_cpp, zag_o,
                 libzigzag_a, killerapp, N };
  const char* name[] = { "dax.h", "yow.h", "boz.h", "zow.h", "foo.cpp",
                         "foo.o", "bar.cpp", "bar.o", "libfoobar.a",
                         "zig.cpp", "zig.o", "zag.cpp", "zag.o",
                         "libzigzag.a", "killerapp" };

  int main(int,char*[])
  {
      typedef pair<int,int> Edge;
      Edge used_by[] = {
          Edge(dax_h, foo_cpp), Edge(dax_h, bar_cpp), Edge(dax_h, yow_h),
          Edge(yow_h, bar_cpp), Edge(yow_h, zag_cpp),
          Edge(boz_h, bar_cpp), Edge(boz_h, zig_cpp), Edge(boz_h, zag_cpp),
          Edge(zow_h, foo_cpp),
          Edge(foo_cpp, foo_o),
          Edge(foo_o, libfoobar_a),
          Edge(bar_cpp, bar_o),
          Edge(bar_o, libfoobar_a),
          Edge(libfoobar_a, libzigzag_a),
          Edge(zig_cpp, zig_o),
          Edge(zig_o, libzigzag_a),
          Edge(zag_cpp, zag_o),
          Edge(zag_o, libzigzag_a),
          Edge(libzigzag_a, killerapp)
      };

      const int nedges = sizeof(used_by)/sizeof(Edge);

      typedef adjacency_list< vecS, vecS, directedS,
          property< vertex_color_t, string >,
          property< edge_weight_t, int >
      > Graph;
      Graph g(used_by, used_by + nedges, N);

      // Label each vertex with its file name.
      graph_traits<Graph>::vertex_iterator v, v_end;
      for (boost::tie(v,v_end) = vertices(g); v != v_end; ++v)
          put(vertex_color_t(), g, *v, name[*v]);

      // Give every edge the same weight.
      graph_traits<Graph>::edge_iterator e, e_end;
      for (boost::tie(e,e_end) = edges(g); e != e_end; ++e)
          put(edge_weight_t(), g, *e, 3);

      // Expose the internal property maps under the names used in the output.
      dynamic_properties dp;
      dp.property("name", get(vertex_color_t(), g));
      dp.property("weight", get(edge_weight_t(), g));

      write_graphml(std::cout, g, dp, true);

      return 0;
  }

The output will be:


  <?xml version="1.0" encoding="UTF-8"?>
  <graphml xmlns="" xmlns:xsi="" xsi:schemaLocation="">
    <key id="key0" for="node" attr.name="name" attr.type="string" />
    <key id="key1" for="edge" attr.name="weight" attr.type="int" />
    <graph id="G" edgedefault="directed" parse.nodeids="canonical" parse.edgeids="canonical" parse.order="nodesfirst">
      <node id="n0">
        <data key="key0">dax.h</data>
      </node>
      <node id="n1">
        <data key="key0">yow.h</data>
      </node>
      <node id="n2">
        <data key="key0">boz.h</data>
      </node>
      <node id="n3">
        <data key="key0">zow.h</data>
      </node>
      <node id="n4">
        <data key="key0">foo.cpp</data>
      </node>
      <node id="n5">
        <data key="key0">foo.o</data>
      </node>
      <node id="n6">
        <data key="key0">bar.cpp</data>
      </node>
      <node id="n7">
        <data key="key0">bar.o</data>
      </node>
      <node id="n8">
        <data key="key0">libfoobar.a</data>
      </node>
      <node id="n9">
        <data key="key0">zig.cpp</data>
      </node>
      <node id="n10">
        <data key="key0">zig.o</data>
      </node>
      <node id="n11">
        <data key="key0">zag.cpp</data>
      </node>
      <node id="n12">
        <data key="key0">zag.o</data>
      </node>
      <node id="n13">
        <data key="key0">libzigzag.a</data>
      </node>
      <node id="n14">
        <data key="key0">killerapp</data>
      </node>
      <edge id="e0" source="n0" target="n4">
        <data key="key1">3</data>
      </edge>
      <edge id="e1" source="n0" target="n6">
        <data key="key1">3</data>
      </edge>
      <edge id="e2" source="n0" target="n1">
        <data key="key1">3</data>
      </edge>
      <edge id="e3" source="n1" target="n6">
        <data key="key1">3</data>
      </edge>
      <edge id="e4" source="n1" target="n11">
        <data key="key1">3</data>
      </edge>
      <edge id="e5" source="n2" target="n6">
        <data key="key1">3</data>
      </edge>
      <edge id="e6" source="n2" target="n9">
        <data key="key1">3</data>
      </edge>
      <edge id="e7" source="n2" target="n11">
        <data key="key1">3</data>
      </edge>
      <edge id="e8" source="n3" target="n4">
        <data key="key1">3</data>
      </edge>
      <edge id="e9" source="n4" target="n5">
        <data key="key1">3</data>
      </edge>
      <edge id="e10" source="n5" target="n8">
        <data key="key1">3</data>
      </edge>
      <edge id="e11" source="n6" target="n7">
        <data key="key1">3</data>
      </edge>
      <edge id="e12" source="n7" target="n8">
        <data key="key1">3</data>
      </edge>
      <edge id="e13" source="n8" target="n13">
        <data key="key1">3</data>
      </edge>
      <edge id="e14" source="n9" target="n10">
        <data key="key1">3</data>
      </edge>
      <edge id="e15" source="n10" target="n13">
        <data key="key1">3</data>
      </edge>
      <edge id="e16" source="n11" target="n12">
        <data key="key1">3</data>
      </edge>
      <edge id="e17" source="n12" target="n13">
        <data key="key1">3</data>
      </edge>
      <edge id="e18" source="n13" target="n14">
        <data key="key1">3</data>
      </edge>
    </graph>
  </graphml>

