Subject: Re: [boost] Formal Review: Boost.Polygon starts today August 24, 2009
From: Barend Gehrels (barend_at_[hidden])
Date: 2009-09-01 19:13:30
Thanks for your explicit answers to my concerns.
Simonson, Lucanus J wrote:
> Barend Gehrels wrote:
>> 1) the library supports many operations, but some basic polygon
>> algorithms are missing:
>> - convex hull
>> - centroid
>> - is_convex
> A grounded argument against inclusion of Polygon in boost would be that the design makes it difficult or unnnatural to implement these
> alorithms to extend the library or as user code, which I don't believe is the case. Obviously the library cannot implement every algorithm, but it does provide a solid foundation for the building of algorithms it does not provide. Centroid, for example, can be implemented by using the Polygon library algorithm that subdivides a polygon into trapezoids and then accumulating the weighted centroids of the trapezoids.
Which will give a (probably very) bad performance.
>> - the "intersection" operation is too slow compared with other
>> libraries. The paper (referred to in the documentation) compares the
>> intersection performance with GPC and CGAL, but does not give exact
>> figures. It gives some plots about scalability using a very high
>> of points per polygon.
> >From the boost website:
> The interface specifications for boost.org library components (as well as for quality software in general) are conceptually separate from implementations of those interfaces. This may not be obvious, particularly when a component is implemented entirely within a header, but this separation of interface and implementation is always assumed. From the perspective of those concerned with software design, portability, and standardization, the interface is what is important, while the implementation is just a detail.
> Dietmar Kühl, one of the original boost.org contributors, comments "The main contribution is the interface, which is augmented with an implementation, proving that it is possible to implement the corresponding class and providing a free implementation."
> Performance is very important, but performance can be addressed over time without changing the interfaces. If there were an algorithm that was licensed under the boost software license that was 100% robust and faster than what I implemented in all cases I would happly accept a patch to swap out my implementation for that algorithm.
> The runtime measurements you have performed are largely the same benchmark repeated many times with minor variations.
Originally I had one intersection benchmark, but you suggested that I
should test scalability and environments with high intersection
density... So I did. There are indeed two types, the ellipse and the
star, tested both three times for scalability. The 10001 test is
probably ridiculous in the sense that it can take 20 hours for a
library, but I still think the benchmark is testing normal and heavy GIS
And users can use other datasets of course.
Sorry that I cannot test your 45-90 polygons, I've no use cases and not
> My line segment intersection algorithm is suboptimal. When there are a large number of edges that all overlap in their x ranges it degenerates to n^2 runtime when an optimal algorithm would scale closer to linear runtime. It is not hard to construct a benchmark to expose this weakness. My overal intersection algorithm also carries a high constant factor overhead. In your own benchmarking of my algorithm you will observe that it is sometimes faster and sometimes slower than geos, GPC and CGAL.
That is true. In the most heavy case it is faster than GPC, but it loses
a lot of precision than which worries me.
> Only your algorithm is faster across the board in your testing. If you are a factor of 10 faster than every other implementation that's great, but we don't know any of the details of your algorithm. What are its preconditions, what post conditions does it guarentee? Is it numerically robust? My understanding of your line segment intersection is that you snap to the floating point grid and ignore the potential to have self intersecting figures in the result. That is a somewhat important post-condition for this kind of algorithm. Particularly for things like, say, fracture analysis where the numerical solvers will reject such polygons at the input to meshing.
Thanks for your clarifications. The details of my algorithm will be
described separately and indeed I start with non-self-intersecting
polygons, giving a non-self-intersecting output.
> In a floating point number you have a certain number of bits of precision. [snipped]
This is also addressed by John, and I wanted to add that I cannot live
with this behaviour because in my tests, which reflects my normal usage,
GIS Map Overlay, I don't get the right results. If I take a higher
factor, it starts to overflow. If I take a lower factor, it is not
> Polygon is a refinement of polygon with holes, polygon45 is a refinement of polygon, polygon90 is a refinement of polygon45 and rectangle is a refinement of polygon90.
I would say: a polygon always have zero or more holes. A polygon45 is a
specialization of a polygon (overriding the algorithms where it makes
sense, e.g. performance-wise). A polygon90 is another specialization of
either a polygon or a polygon45.
A rectangle is a special case, having nothing to do with a polygon, this
makes things really complicated.
> [snipped some things which are answered]
>> I didn't try everything, but I'm getting the feeling that the concepts
>> are not implemented as they should be. The file
>> polygon_set_concept.hpp, which should give an outline of the polygon
>> set concept, is dependant on polygon_set_data.hpp (which implements
>> that concept). That should be the other way round. Inside that
>> concept file, points are often copied into
>> a polygon_set_data temporary. That is not how concepts should be
>> implemented, and is very bad for performance.
> I have to copy the polygon edges into something so that I can sort and scan them. That thing is encapsulated by the polygon_set_data class.
As far as I know concepts should not implement things. Concepts should
model things. Algorithms should take classes fulfilling the concepts. In
BP, inside the concept file a model of that concept is used to do
things. It sounds as inside-out.
> Our agreement at boostcon was that I would submit my library for review in a matter of months and that eventually when your library is reviewed it would be necessary to demonstrate interoperability. There is nothing about my library that would make it hard to demonstrate interoperability so I don't understand your concern.
We discussed interoperability at BoostCon and I was advised to take
Boost.Polygons booleans. Therefore I started those benchmarks and
included our (then suboptimal) intersection. Because of your paper and
presentation, everyone had good confidence in the performance. However,
I concluded other things and mailed this in June (off-list), giving the
message that we could not use Boost.Polygon's booleans because of
performance, because of floating point and because of compiler issues
(the last ones are solved now).
Another thing I wanted to test is if we could adapt our geometry types
to each others concepts. That test is not done, at least not by me, but
for purpose of this review I looked to your concepts. Not an in-depth
study. I conclude that it is, at least partly, implementation. So I
don't know how to continue with this. It is useless to give
Boost.Polygon my polygon and as soon as BP does things with it, the
whole thing is just converted. That is not how a geometry concept should
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk