Subject: Re: [boost] Formal Review: Boost.Polygon starts today August 24, 2009
From: Simonson, Lucanus J (lucanus.j.simonson_at_[hidden])
Date: 2009-09-01 13:49:03
Barend Gehrels wrote:
> 1) the library supports many operations, but some basic polygon
> algorithms are missing:
> - convex hull
> - centroid
> - is_convex
A grounded argument against inclusion of Polygon in boost would be that the design makes it difficult or unnnatural to implement these
alorithms to extend the library or as user code, which I don't believe is the case. Obviously the library cannot implement every algorithm, but it does provide a solid foundation for the building of algorithms it does not provide. Centroid, for example, can be implemented by using the Polygon library algorithm that subdivides a polygon into trapezoids and then accumulating the weighted centroids of the trapezoids.
> - the "intersection" operation is too slow compared with other
> libraries. The paper (referred to in the documentation) compares the
> intersection performance with GPC and CGAL, but does not give exact
> figures. It gives some plots about scalability using a very high
> of points per polygon.
>From the boost website:
The interface specifications for boost.org library components (as well as for quality software in general) are conceptually separate from implementations of those interfaces. This may not be obvious, particularly when a component is implemented entirely within a header, but this separation of interface and implementation is always assumed. From the perspective of those concerned with software design, portability, and standardization, the interface is what is important, while the implementation is just a detail.
Dietmar Kühl, one of the original boost.org contributors, comments "The main contribution is the interface, which is augmented with an implementation, proving that it is possible to implement the corresponding class and providing a free implementation."
Performance is very important, but performance can be addressed over time without changing the interfaces. If there were an algorithm that was licensed under the boost software license that was 100% robust and faster than what I implemented in all cases I would happly accept a patch to swap out my implementation for that algorithm.
The runtime measurements you have performed are largely the same benchmark repeated many times with minor variations. My line segment intersection algorithm is suboptimal. When there are a large number of edges that all overlap in their x ranges it degenerates to n^2 runtime when an optimal algorithm would scale closer to linear runtime. It is not hard to construct a benchmark to expose this weakness. My overal intersection algorithm also carries a high constant factor overhead. In your own benchmarking of my algorithm you will observe that it is sometimes faster and sometimes slower than geos, GPC and CGAL. Only your algorithm is faster across the board in your testing. If you are a factor of 10 faster than every other implementation that's great, but we don't know any of the details of your algorithm. What are its preconditions, what post conditions does it guarentee? Is it numerically robust? My understanding of your line segment intersection is that you snap to the floating point grid and ignore the potential to have self intersecting figures in the result. That is a somewhat important post-condition for this kind of algorithm. Particularly for things like, say, fracture analysis where the numerical solvers will reject such polygons at the input to meshing.
> 3) Boost.Polygon supports integer but does not support double for
> boolean operations.
> For the operations "area" and "contains", coordinates in "double"
> floating precision are supported. Also the boolean operations accept
> double coordinates, but crashes immediately. I think it should not
> crash, slightly better is to prevent floating point coordinates by
> giving a compilation error. Much better: floating point coordinates
> should be implemented. For me as GIS-user, a library without floating
> point support is unusable.
In a floating point number you have a certain number of bits of precision. An integer type can be chosen with equal or greater precision and the input translated to center on the origin and scaled by a power of 2 to use the max number of bits when converted to integer. The intersection operation is then performed with at most one bit of snapping and rounding loss of precision and then the result can be scaled back to floating point and translated back to its original position. With this scheme you get precision exactly comparable to what floating point provides. You may not be able to use equality compare on the floating point values from before and after, but you shouldn't be doing that with floating point numbers in the first place. I can detect floating point coordinates at compile time and insert this scaling such that it is transparent to the user of the library. This fixed point algorithm really is equivalent to floating point.
>> - What is your evaluation of the design?
> The diagram shows the polygon_90 as a refinement to a rectangle, the
> polygon_45 as a refinement to a polygon_90, the polygon as a
> to a polygon_90, and a polygon_45_with_holes as a refinement to both a
> polgon_90_with_holes and a polygon_45. All this makes really no sense
> me, I think it could and should be much more simple.
I believe Steven Watanabe already pointed out that the arrows in the diagram are the reverse of what is conventional. This is a fix that needs to be made to the documentation.
What is implemented in the library is the correct refinement relationship. I recall Dave Abrahams posted an ascii diagram of what he thought the refinement relationships should be:
> +-- p <------+
> / \
> pwh <--+ +-- p45 <---+
> \ / \
> +-- p45wh <--+ +-- p90
> \ /
> +-- p90wh <--+
Polygon is a refinement of polygon with holes, polygon45 is a refinement of polygon, polygon90 is a refinement of polygon45 and rectangle is a refinement of polygon90.
>> - What is your evaluation of the implementation?
> It is curious that Boost.Polygon went from Tag Dispatching to SFINAE
> last year, while we (GGL) went from SFINAE to Tag Dispatching. We're
> very glad with our change, while most compiler problems and warnings
> in Boost.Polygon origin from the SFINAE implementation. I'm still
> that Tag Dispatching, suggested to us on this list by Dave Abrahams,
> a very good choice in a geometry library implementation. We're happy
> with it and all compiler and compatability problems are vanished.
SFINAE is necessary for generic operators and good to have for generic functions with very generic names such as "area" and "distance". SFINAE is not a bad thing in a boost library. It turns out thate once compiler issues are understood they can be overcome.
> Things are copied from other boost libraries. File isotropy.hpp
> a nearly literal copy of enable_if. There is also an extraction of
> While this is an explicit choice, I don't think it is a good choice.
> The Boost Concept Check Library is not used to check concepts.
These things copied from boost source code are inside #ifdef BOOST_POLYGON_NO_DEPS and are to facilitate compiling and using the library in stand alone mode. This is necessary for internal use of the library and I didn't want to fork the code to submit it to boost.
> I didn't try everything, but I'm getting the feeling that the concepts
> are not implemented as they should be. The file
> polygon_set_concept.hpp, which should give an outline of the polygon
> set concept, is dependant on polygon_set_data.hpp (which implements
> that concept). That should be the other way round. Inside that
> concept file, points are often copied into
> a polygon_set_data temporary. That is not how concepts should be
> implemented, and is very bad for performance.
I have to copy the polygon edges into something so that I can sort and scan them. That thing is encapsulated by the polygon_set_data class.
> I conclude with the remark that I've asked several times on this list
> and off-list for cooperation, I suggested that Luke could join us
> writing a combined Geometry Library for Boost. All these requests were
> never answered. I would still be in favour of one library.
> If this library is accepted it will create a different situation for
> subsequent geometry libraries. In reviews it will be expected that
> these libraries work together or that one is built on top of the
> other. If
> that is the case, the base should be very very good. It is to the
> other reviewers to decide that this is the case.
Our agreement at boostcon was that I would submit my library for review in a matter of months and that eventually when your library is reviewed it would be necessary to demonstrate interoperability. There is nothing about my library that would make it hard to demonstrate interoperability so I don't understand your concern.
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk