Subject: Re: [geometry] Reliable invariants of geometry representation? Validation tests?
From: Barend Gehrels (barend_at_[hidden])
Date: 2012-02-10 10:27:15
On 10-2-2012 11:49, Volker Schöch wrote:
> Hi Barend,
>> The input should be valid. Valid means valid w.r.t. OGC requirements: a so-called simple (multi)polygon, not self-intersecting, self-tangencies are allowed, correct orientation of exterior ring, and interior rings, correct closure.
> Obviously, there is some broader-consensus definition of a valid polygon representation you are referring to. I was quickly searching the http://www.opengeospatial.org website, but only found the "Glossary of Terms" which mentions polygons but does not offer a very formal definition. I would appreciate it if you could point me to an authoritative source; and if such a webpage conventiently exists, may I suggest to add that reference to the documentation, too. Such a formal definition could probably answer many of my questions without the need to bother the mailing list.
The rules can the best looked up in this OGC document (Simple Features -
That document (which is a perfect companion to our library) also refers
to many other sources.
You will see that a polygon is defined in a "counter clockwise
direction". So you are right with that respect.
There are more rules defined there (topological closed, etc).
I will include a link to the documentation (but there are new versions
there also now and then). Anyway we will also mention the "simple
features" there (that was somewhere in the past but is slipped out).
Note that we follow the ideas of the OGC, and most of the times
literally, but sometimes deviate from it. For example: "AsText" is in
our library the "wkt" stream manipulator.
See also this (outdated!) page:
which will be moved to the docs as well (but not in this version).
>>> And by the way, what's the reason why you chose clockwise as your default orientation when the convention is to view a counter-clockwise oriented polygon as filled on the inside? Is this computer science way of thinking with the origin of the coordinate system in the upper left corner of the screen? ;-)
>> It has always been. No, it is not a computer science way - I have a GIS origin.
> For instance, I checked Wikipedia and, more authoritative, Wolfram's Mathworld:
Sorry - when I said "it has always been", I meant: "it has always been
in our library". So from 1995 on. Of course the moment of redesign would
have been a good moment to change it.
But it was/is not so clear what the convention in the GIS world is.
Buffer a linestring in PostGIS and you will get a clockwise polygon
back. Buffer a linestring in SQL Server and the result is a counter
Besides that the origin defines orientation too. Have a clockwise
polygon defined in a mathematical coordinate grid displayed on a
computer screen and it is upside down and counter clockwise.
So we kept our default convention.
In the review it was clear that some people want clockwise, others
(probably more indeed) want counter clockwise. Our library was accepted
that but one of the conditions was that it would be orientation (and
closure) agnostic. So it is.
The default is not that important. We just provide a "model" of a
polygon. But people can provide their own models as well. And even in
our polygon it is really easy to make it ccw, as you have seen.
> "Note that the area of a convex polygon is defined to be positive if the points are arranged in a counterclockwise order, and negative if they are in clockwise order (Beyer 1987)." (from Mathworld)
> That's why our polygons have counter-clockwise orientation, just to give you some background. I assume that using GGL with ClockWise=false does not have any implications, performance- or otherwise?
The geometries are traversed in reverse. So they should supply a
bidirectional iterator. I've measured performance and there are no real
differences. This with the normal (std::) iterator set, it might be
imagined that other iterators would differ more.
Besides the combinatorial explosion we already have (geometry types,
point types, dimensions) this gives various extra combinations. There
are many tests in the unit test which test both orientations, and
open/closed. I have to inventarise again if it is complete with that
Another option is that one of the input is clockwise, the other is
counter clockwise, and the output is open :-(. That combination is not
yet supported. The three can (theoretically) also have other point types
(e.g. tuples, int/double, etc).
>>> [...] why did you choose to require closed polygons having begin==end?
>> It is not required. This is optional either. Opened/closed. Most (probably all) algorithms are checked on this.
> Huh. Can you elaborate? Correct me if I'm wrong: Are you saying that begin==end is "optional", i.e., that all algorithms should return the same results, with the same big-O performance, whether or not begin==end, assuming that the polygon type is templated as Closed=true?
Yes. We have one iterator type (closing_iterator / closeable_view) which
closes the polygon. All algorithms assume closed polygons. But the input
can be open and it is then closed automatically by the iterator. This is
all compile time. There is no "if"...
This is also the way orientation works.
> That would be greatly appreciated. We would rather have our closed polygons not carry that additional appendix-point. However, I was under the impression that leaving out that point returned a different area for the otherwise same polygon. So I am really not yet clear about this.
So it is possible, and where it deviates, it is a bug.
> Because this has not been explicitly answered in your last email, let me ask again: What is the area of a polygon that does not have begin==end, if explicitly using a closed polygon model (specified by way of template parameter)? Is the result well-defined?
No, it is not. The area will be wrong then.
>> All this is not checked before (reason: performance). [...] There will be an is_valid algorithm.
> These two form a really strong team. Much better than the has_self_intersections hack. I think your paradigm "shoot first ask questions later" is great for the purpose of a geometry library. Leave responsibility for valid input to the caller (but, obviously, the reponsibility to create valid output from any valid input stays with the library). I'd like to encourage that approach, and obviously it requires some tools for the caller to check for validity of input beforehand. Stick with your paradigm, provide us with some tools to check our input, and ditch has_self_intersections and similar ugly ducklings. And get rid of any (all? most?) unsuspectingly throwing exceptions along the way. Invalid input results in invalid (undefined) output, that's fair enough as long as it is QUICK. :-)
Thanks for your opinion!
Geometry list run by mateusz at loskot.net