|
Boost : |
Subject: Re: [boost] Formal Review: Boost.Polygon starts today August 24, 2009
From: Barend Gehrels (barend_at_[hidden])
Date: 2009-08-31 18:33:50
>
> The formal review of the Boost.Polygon library by Lucanus Simonson
> starts today, August 24, 2009 and will finish September 2, 2009.
>
First, congratulations Luke, and also Gyuszi, with having finished this
library and having it for review.
As one of the authors of another Geometry Library, also aimed for Boost,
I'll try to be as objective as possible. I'll not compare it too much to
our library, though I'll give some performance measures compared with
many other libraries, also ours (ggl).
When I started testing Boost.Polygon, in February or March, there was
not yet any documentation, it was difficult to write the benchmarks, it
didn't compile using MSVC 2005. That is all solved now, it is
documented, it compiles using MSVC 2005 now, there are some positive
reviews on design, implementation and the paper. So this is a good step
forward.
In my domain (GIS) we, normally, only use floating point coordinates on
arbitrary angle polygons. I'll not review the 45/90 polygons because I
don't know them and I'm not interested in them.
Expectations: I think a templated Boost.Polygon library should at least:
1) support most of the basic polygon geometry algorithms
2) have a performance comparable to other libraries
3) support both integer and floating point coordinate types
1) the library supports many operations, but some basic polygon
algorithms are missing:
- convex hull
- centroid
- is_convex
Those operations are found in many libraries and I think they should be
supported by Boost.Polygon. It would also make the performance
comparison (even) more interesting.
2) though some reviewers were satisfied about the performance, and the
paper indicates it is well performing, our benchmark shows that it is in
general slower or much slower than other libraries:
- the performance of the "contains" operation is perfect, fastest available
- the "area" operation is about a factor 1.5 slower than most other
libraries. This is surprising because "area" is quite a simple
operation, it should be possible to implement it with the same
performance as e.g. CGAL or wykobi.
- the "intersection" operation is too slow compared with other
libraries. The paper (referred to in the documentation) compares the
intersection performance with GPC and CGAL, but does not give exact
figures. It gives some plots about scalability using a very high number
of points per polygon.
In normal usage, in nearly all cases, GPC outperforms Boost.Polygon with
a high factor:
--> GPC: 9.0 seconds, GTL/Boost.Polygon: 92.2 seconds for the same
operation (for 100 points per overlayed ellipse)
--> GPC: 13.7 seconds, GTL/Boost.Polygon: 115.3 seconds for the same
operation (1000 points per overlayed ellipse, which is already large).
--> Also with 10000 points GPC is still much faster (factor 5).
--> Only in high-intersection-densities with a very high number of
points GTL is faster than GPC.
If I was a user and license would allow me, I would definitely take GPC
and not BP, because of performance and because of floating point...
--> compared with CGAL: in normal usage, GTL is indeed faster. But in
intersection-dense environments CGAL scales better (this in contrary to
the paper)
--> compared with GGL: in normal usage Boost.Polygon is a factor 81.1
slower than GGL. While it was suggested that we should use BP for our
boolean operations, and we evaluated this (resulting in this
comparison), we're glad that we first compared it to other libraries
because a user can wait for 1 second, but cannot wait for 92 seconds...
for a less precise result... In all tests GGL is way faster.
--> for the complete overview, see
http://trac.osgeo.org/ggl/wiki/Performance and
http://trac.osgeo.org/ggl/wiki/IntersectionMore
Most of our benchmark results are contrary to the results of your paper.
I think that if you write things like this, you should publish the
benchmark sources because it is very sensitive information. People
should be able to reproduce them. Also exact figures in the paper would
help to interpret the results.
The Boost.Polygon library is specialized for boolean operations and the
paper is all about performance. But in the end it is really not
performing well...
3) Boost.Polygon supports integer but does not support double for
boolean operations.
For the operations "area" and "contains", coordinates in "double"
floating precision are supported. Also the boolean operations accept
double coordinates, but crashes immediately. I think it should not
crash, slightly better is to prevent floating point coordinates by
giving a compilation error. Much better: floating point coordinates
should be implemented. For me as GIS-user, a library without floating
point support is unusable.
For the comparison I converted to integer coordinates (multiplying using
a factor of 10000.0). But in the resulting summed area you can see that
this is not working perfectly. The area differs, and where it should
increase (star-ellipse with more and more points) it is decreasing... I
tried other integer-double-conversion factors as well but that made
things worse.
>
>
> - What is your evaluation of the design?
It is difficult to answer this question because I (with Bruno, and some
folks of this list) designed another library in another way, it is hard
to be objective here. I would prefer a geometry library, instead of a
polygon library. Polygons come seldom alone. It would be useful to have
other geometry types as well.
The 45/90 polygons could be implemented as specializations (special
cases) of the normal polygons, overriding implementations. This library
is designed the other way round, it started with 90/45 polygons and
arbitrary polygons were added to be submitted as a boost library.
The diagram shows the polygon_90 as a refinement to a rectangle, the
polygon_45 as a refinement to a polygon_90, the polygon as a refinement
to a polygon_90, and a polygon_45_with_holes as a refinement to both a
polgon_90_with_holes and a polygon_45. All this makes really no sense to
me, I think it could and should be much more simple.
> - What is your evaluation of the implementation?
It is curious that Boost.Polygon went from Tag Dispatching to SFINAE
last year, while we (GGL) went from SFINAE to Tag Dispatching. We're
very glad with our change, while most compiler problems and warnings in
Boost.Polygon origin from the SFINAE implementation. I'm still convinced
that Tag Dispatching, suggested to us on this list by Dave Abrahams, is
a very good choice in a geometry library implementation. We're happy
with it and all compiler and compatability problems are vanished.
Things are copied from other boost libraries. File isotropy.hpp contains
a nearly literal copy of enable_if. There is also an extraction of MPL.
While this is an explicit choice, I don't think it is a good choice. The
Boost Concept Check Library is not used to check concepts.
I didn't try everything, but I'm getting the feeling that the concepts
are not implemented as they should be. The file polygon_set_concept.hpp,
which should give an outline of the polygon set concept, is dependant on
polygon_set_data.hpp (which implements that concept). That should be the
other way round. Inside that concept file, points are often copied into
a polygon_set_data temporary. That is not how concepts should be
implemented, and is very bad for performance.
The file transform.hpp contains a large number of enumarions for
transforming polygons up/down/west/east etc. I think this is not
necessary, and besides that "west" / "east" gives the impression of
spherical coordinate systems while this library is cartesian only.
Left/right should already be better, but why not two offset numbers in
two dimensions?
> - What is your evaluation of the documentation?
I didn't read all. It is looking good but contains several errors in the
details. E.g. documentation of the polygon_90_with_holes_concept,
function begin_holes has the description "Returns the begin iterator
over the range of coordinates that correspond to horizontal and vertical
edges.". This is probably copy/paste error. Another example is the "void
clean() const" function, which cannot be const of course, indicating
that this documentation is written / copied / pasted and not generated
with e.g. Doxygen.
> - What is your evaluation of the potential usefulness of the library?
I think most people using geometry / polygons are using floating point
coordinates. As these are not supported in the algorithms where this
library is specialized on (booleans), I think it is useful for a limited
public (VSLI?)
> - Did you try to use the library? With what compiler? Did you have any
> problems?
MSVC 2005, MSVC 2008, GCC 4.4 using MinGW. Everything compiled but the *
notation for intersections didn't compile anymore (it did before)
> - How much effort did you put into your evaluation? A glance? A quick
> reading? In-depth study?
An in-depth study to several aspects of the library
> - Are you knowledgeable about the problem domain?
I'm designing and implementing GIS / geometry libraries since 1992.
I hope that my review was as objective as can be expected from an
alternative library writer.
I conclude with the remark that I've asked several times on this list
and off-list for cooperation, I suggested that Luke could join us
writing a combined Geometry Library for Boost. All these requests were
never answered. I would still be in favour of one library.
If this library is accepted it will create a different situation for
subsequent geometry libraries. In reviews it will be expected that these
libraries work together or that one is built on top of the other. If
that is the case, the base should be very very good. It is to the other
reviewers to decide that this is the case.
Regards, Barend
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk