Hi Volker,

Volker Schöch wrote On 17-11-2014 14:12:

Hi,

 

I’m still struggling with your notion of a „valid“ polygon: E.g., a polygon that contains spikes does not satisfy the is_valid(…) predicate. Consequently, this seems to imply that spikes do not carry any information at all and can be removed without harm. Of course, this may depend on the application domain, but for the purpose of this discussion, let’s stick with the notion used for boost::geometry. I’d just like to see your explicit confirmation, or have my understanding corrected.

 

Corollary: If for some multi-polygon P, is_valid(remove_spikes(P)) holds, then for all purposes of boost::geometry, remove_spikes(P) is equivalent to P.


Seeing also your ticket - no, this does not hold. remove_spikes removes the spikes in polygons. That does not necessarily result in a valid (multi)polygon. The geometry may still be invalid, but due to other invalidities.

E.g. your sample (in the ticket) with two points, where a spike is removed, results in a polygon with only one point. That is IMO a perfect result from remove_spikes. Because the spike is removed. remove_spikes should do nothing more than that... Why should it remove a remaining point - that point is not a spike. And remove_spikes should not change anything but spikes...

That the result is still not as desired, OK, I understand that but... that should be implemented in another function (e.g. correct).

 

Corollary: For all purposes of boot::geometry, all polygons that only consist of spikes are equivalent to each other and are equivalent to the empty multi-polygon.


See above and also - they are not spatially equal. If we examine spatial equalness, we don't check validity. If one contains a spike and another does not - they are not spatially equal because their boundaries do not cover the same spatial space.

 

Let’s once again look at the difference(…) algorithm, and let’s assume the following two geometries, which satisfy the is_valid(…) predicate. Note that _intPolygon is a model for boost::geometry::multi_polygon, and _intRect is a model for boost::geometry::box. _intPolygon is oriented couter-clockwise and not closed, points are based on int.

 

_intPolygon polygonA;

boost::geometry::read_wkt("MULTIPOLYGON(((1 1,1000 2,2 2,1000 3,1 3,1 1)))", polygonA);

_ASSERT( boost::geometry::is_valid(polygonA) ); // holds

 

_intRect rectB;

boost::geometry::read_wkt("BOX(1 1,900 4)", rectB);

_ASSERT( boost::geometry::is_valid(rectB) ); // holds

 

_intPolygon polygonC;

boost::geometry::difference(polygonA, rectB, polygonC);

 

The resulting multi-polygon polygonC is the empty multi-polygon. This is consistent with the above: The part of the polygon, that is left after the difference operation, is mathematically non-empty, but in its int representation it is a spike, and thus is removed entirely.


OK, yes we now remove spikes in spatial intersection operations. Because they make the output invalid. I think that should be done.

 

On the other hand, the result seems to be at odds with the fact that boost::geometry::covered_by(polygonA, rectB) returns false. (Note: As of 1.57.0, covered_by is not implemented for polygon and box, so for this test I converted the box to a polygon. I assume that this doesn’t affect the argument of this discussion.) As far as I understand, that’s “by design” in boost::geometry.

 

Is my understanding correct so far?


We should come back to that later. Indeed - if the output would have been a spike which is then removed, the results might be mutually inconsistent... I see what you mean.

 

Now, let’s try to apply this to the problem I have to solve in my application. Let’s consider a the degenerated box that we already know from the discussion about “access violation”.

 

_intRect rectB;

boost::geometry::read_wkt("BOX(417 2064,597 2064)", rectB);

 

This box does not satisfy the is_valid(…) predicate. Let’s still convert it to a multi-polygon:

 

_intPolygon polygonB;

boost::geometry::convert(rectB, polygonB);

 

The resulting polygon is MULTIPOLYGON(((417 2064,597 2064,597 2064,417 2064))). The polygon does not satisfy is_valid(…) neither.

Question: Shouldn’t the convert operation actually yield the empty multi-polygon?


No - convert should convert a rectangle (any rectangle) in another geometry, and does not necessarily check for is_valid or make the result valid. So IMO this is correct behaviour.

 

Anyway, let’s try to fix what we have:

 

boost::geometry::remove_spikes(polygonB);

 

The resulting polygon is MULTIPOLYGON(((417 2064))). Still, the polygon does not satisfy is_valid(…).

Again, shouldn’t the remove_spikes operation actually yield the empty multi-polygon?


No

 

If this is not the way to turn a polygon with spikes into valid input for boost::geometry algorithms, then how should I do it instead?


That is a good question. Our dissolve algorithm (still in extensions) should make geometries valid. If they should remove spikes then - probably, that is not yet considered when it was designed. dissolve removes self-intersections from a potentially multi-self-intersecting geometry by trying to collect all clockwise (or in your case ccw) areas.

 

Is boost::geometry::area(polygonB)==0 a proper way to determine if a polygon with spikes is properly represented by the empty multi-polygon? This seems dubious to me: The result of area(…) will be flawed with rounding error and may or may not be accurately 0. Thus, how do I properly determine if a given multi-polygon that potentially contains spikes is equivalent to the empty multi-polygon?

 

Given all of the above, it seems pretty natural to use boost::geometry’s own equals(…) algorithm for this purpose: “The free function equals checks if the first geometry is spatially equal the second geometry. Spatially equal means that the same point set is included. […]”

http://www.boost.org/doc/libs/1_56_0/libs/geometry/doc/html/geometry/reference/algorithms/equals.html

 

Unfortunately, boost::geometry::equals(MULTIPOLYGON(((417 2064,597 2064,597 2064,417 2064))), MULTIPOLYGON()) yields false. Why is that? Does the equals(…) algorithm require that the input does not contain spikes? I don’t see this in the documentation. Is this the expected behavior of this algorithm? If so, what would be a correct way of determining if a polygon is a spike-only polygon?


In general, none of our functions check validity beforehand. That is mainly because of performance reasons: a validity check is quite time-consuming, so it is left to the user to supply valid input, or check validity beforehand. So area, equals, difference, and all the functions you name, they are all not checking if the polygon is valid. At least not beforehand (sometimes an operation fails because of invalidity - then an exception can be thrown).

I understand your problem of course, because our valid-check is not yet perfect and also (seeing your tickets) some of the overlay operations seem still to result in invalid output (that should be fixed of course).

We'll try to work on this. I hope the approach is a bit more clear now.

Regards, Barend