
Subject: Re: [boost] [gsoc19] Seeking a mentor for Boost.Geometry project related to triangulation and random sampling
From: Adam Wulkiewicz (adam.wulkiewicz_at_[hidden])
Date: 2019-03-19 15:20:28


Tinko Bartels via Boost wrote:
>
> I completely overlooked that one. I see that it is designed to optionally
> support higher-precision coordinate types. I will look further into that.

The whole library supports a user-defined calculation type.
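
For illustration, a minimal sketch (my own example, not from this thread) of
passing a user-defined calculation type through a strategy, here computing a
cartesian distance in long double while the coordinates stay double:

#include <boost/geometry.hpp>
#include <iostream>

namespace bg = boost::geometry;

int main()
{
    bg::model::d2::point_xy<double> a(0.0, 0.0), b(3.0, 4.0);

    // Default calculation type, deduced from the coordinate type.
    double d1 = bg::distance(a, b);

    // Explicit calculation type via the pythagoras strategy.
    long double d2 = bg::distance(a, b,
        bg::strategy::distance::pythagoras<long double>());

    std::cout << d1 << " " << d2 << "\n";
}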

>> in-circle-predicate however is not. It also looks like more complex
>> operation since you have to find a center of the circle and then
>> check if a point is inside it. In Boost.Geometry this would be
>> divided into centroid() and distance(). So my guess is that the in-
>> circle-predicate should be divided into more basic building blocks.
>> However I'm not sure if the center of the circle is the same as
>> barycenter in non-cartesian coordinate systems. Furthermore
>> Boost.Geometry doesn't have non-cartesian centroid yet. Even if you
>> do not support geographic triangulation right now we should do
>> theoretical work and prepare the interface for future extensions,
>> i.e. know what basic coordinate-system-dependant operations will be
>> needed in the future.
> I implemented it in that two-stage way in my implementation of Delaunay
> Triangulation for the programming competency test. However, I found that
> this seems to be usually done in a single step (see
> https://www.cs.cmu.edu/~quake/robust.html ), I guess for performance and
> stability reasons. I noticed that with larger problem sizes (1'000'000
> points and more) my Delaunay triangulation would not terminate for some runs
> (I generate the problems randomly) and some experiments (I did not look too
> deeply into it) suggest that my two-staged, non-robust in-circle-predicate
> is to blame here.
>
> I would trust the literature there and implement a simple approximate
> predicate that works for non-degenerate problems and an adaptive predicate
> that will be a little slower but still work for degenerate cases.

Vissarion pointed out in another email that I made a mistake in saying that
the centroid is the center of the circle.

Sorry about that. The two steps would be envelope(triangle,
sphere_on_surface) and covered_by(point, sphere_on_surface). I wrote
sphere_on_surface because we also have nsphere (an n-dimensional sphere) in
the extensions, and AFAIU this would be something different. In particular,
in a geographic coordinate system the shape of the sphere on the surface of
the ellipsoid would be different, and we wouldn't be able to represent it as
a center point and a radius.

He also seems to agree with you that this should be done in one step. I
have nothing against it if it's proven to be faster. My main concern was
obviously the reusability of the code and the complexity of the interface.
A one-stage operation would require a new type of strategy, but that's ok.
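
For illustration, a rough cartesian sketch of such a one-step predicate (my
own code, not an existing Boost.Geometry strategy), using the classic
determinant formulation from the page linked above; the adaptive variant
would re-evaluate it with exact arithmetic when the result is too close to
zero:

struct point { double x, y; };

// Returns a positive value if d lies inside the circumcircle of the
// counter-clockwise triangle (a, b, c), negative if it lies outside and
// (approximately) zero if it lies on the circle.
double incircle(point const& a, point const& b, point const& c, point const& d)
{
    double adx = a.x - d.x, ady = a.y - d.y;
    double bdx = b.x - d.x, bdy = b.y - d.y;
    double cdx = c.x - d.x, cdy = c.y - d.y;

    double ad = adx * adx + ady * ady;
    double bd = bdx * bdx + bdy * bdy;
    double cd = cdx * cdx + cdy * cdy;

    // 3x3 determinant, expanded along the first row.
    return adx * (bdy * cd - bd * cdy)
         - ady * (bdx * cd - bd * cdx)
         + ad  * (bdx * cdy - bdy * cdx);
}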

>
>> Note also that performing randomization for arbitrary geometries
>> would be supported with this interface, also e.g. N random points
>> within multipoint.
> I agree but the probability of a uniformly distributed point lying inside a
> multipoint (except for a very coarse space) is vanishingly small. I think
> that point generation should at least be specialized by the type of domain
> (discrete, lines, volumes).

If the interface allowed it, the predicates passed in would define in which
topological parts of the geometry the randomized points would lie. So if one
wanted to get points within a multipoint like this:

generate(N, within(multipoint), output)

the algorithm would return N points from the multipoint (in the interior
of the multipoint, see: https://en.wikipedia.org/wiki/DE-9IM). Note that it
wouldn't have to be a rejection sampler, because we can imagine a case where
the predicate expression is analysed, a new geometry is generated from it,
and then sampling is done in that geometry, e.g.:

generate(N, within(polyA) && ! within(polyB), output)

could first calculate

difference(polyA, polyB, polyC)

and then generate points that are in polyC (with special handling of
boundaries because the boundary of polyB would be allowed).

Not all predicates could be allowed; we could limit which kinds of
geometries and predicates can be passed together, etc.
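
To make the idea concrete, a hedged sketch (my own code, the
generate()/predicate interface itself does not exist yet) of the second case
in cartesian, using the existing difference(), envelope() and within()
algorithms and plain rejection sampling; boundary handling (covered_by() vs.
within()) is glossed over:

#include <boost/geometry.hpp>
#include <random>
#include <vector>

namespace bg = boost::geometry;

using point_t = bg::model::d2::point_xy<double>;
using polygon_t = bg::model::polygon<point_t>;
using multi_polygon_t = bg::model::multi_polygon<polygon_t>;

std::vector<point_t> sample_in_difference(polygon_t const& polyA,
                                          polygon_t const& polyB,
                                          std::size_t n, std::mt19937& rng)
{
    multi_polygon_t polyC;
    bg::difference(polyA, polyB, polyC);   // materialize the predicate expression

    bg::model::box<point_t> box;
    bg::envelope(polyC, box);              // bounding box of the target region

    std::uniform_real_distribution<double>
        dx(bg::get<bg::min_corner, 0>(box), bg::get<bg::max_corner, 0>(box)),
        dy(bg::get<bg::min_corner, 1>(box), bg::get<bg::max_corner, 1>(box));

    std::vector<point_t> out;
    while (out.size() < n)                 // plain rejection sampling
    {
        point_t p(dx(rng), dy(rng));
        if (bg::within(p, polyC))
            out.push_back(p);
    }
    return out;
}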

>
>> It is also possible to decide to go this way but do not implement all
>> elements right now. In such case we could think about the interface
>> and be prepared for extensions in the future. E.g. for now support
>> only something like this:
>>
>> namespace bg = boost::geometry;
>> bg::rng::generate(bg::rng::within(A), output);
> I like that interface. I would maybe drop the rng, because it's conceivable
> to have generators in the future for deterministic point generation as well,
> for example to create some regular lattice or nodes on edges.

The problem is the name "within", which already exists in the boost::geometry
namespace as a relational operation and also in the boost::geometry::index
namespace as an rtree spatial predicate. So we'd have to find a different
name or put it in a different namespace.
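
For reference, the two existing meanings of "within" that a generator
predicate would collide with (a small sketch of the current API):

#include <boost/geometry.hpp>
#include <boost/geometry/index/rtree.hpp>
#include <iterator>
#include <vector>

namespace bg = boost::geometry;
namespace bgi = boost::geometry::index;

int main()
{
    using point_t = bg::model::d2::point_xy<double>;
    using box_t = bg::model::box<point_t>;

    box_t box(point_t(0, 0), point_t(10, 10));
    point_t p(1, 1);

    // 1. boost::geometry::within - relational operation returning bool.
    bool inside = bg::within(p, box);

    // 2. boost::geometry::index::within - rtree spatial query predicate.
    bgi::rtree<point_t, bgi::quadratic<16>> rtree;
    rtree.insert(p);
    std::vector<point_t> result;
    rtree.query(bgi::within(box), std::back_inserter(result));

    return inside ? 0 : 1;
}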

One more general comment at the end, related to all projects and
non-cartesian coordinate systems. In general, the approach in the library is
to perform calculations directly in the coordinate system of the data or
strategy, so we do not transform between coordinate systems in the process.
If the user wants to transform, e.g. geographic to cartesian, perform an
operation and then transform back, he can do it. Spatial reference system
transformations are implemented but not documented. See:

https://github.com/boostorg/geometry/tree/develop/include/boost/geometry/srs
https://github.com/boostorg/geometry/tree/develop/test/srs
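
For example, a rough sketch of such a round trip based on the tests linked
above (the API is undocumented, so the exact names may differ from the
current develop branch):

#include <boost/geometry.hpp>
#include <boost/geometry/srs/projection.hpp>

namespace bg = boost::geometry;
namespace srs = bg::srs;

int main()
{
    using point_geo = bg::model::point<double, 2, bg::cs::geographic<bg::degree> >;
    using point_xy = bg::model::d2::point_xy<double>;

    // Geographic -> cartesian via a user-chosen (contextual) projection.
    srs::projection<> prj = srs::proj4("+proj=utm +zone=33 +ellps=WGS84 +units=m");

    point_geo p_geo(15.0, 52.0);   // lon, lat in degrees
    point_xy p_xy;

    prj.forward(p_geo, p_xy);      // project to cartesian
    // ... perform the cartesian operation here ...
    prj.inverse(p_xy, p_geo);      // and transform the result back

    return 0;
}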

The problem with transformations is that they are contextual, i.e. there
is no single transformation with a standard set of parameters that would
work for all possible geographic data. It depends on the use case and
should be chosen by the user. AFAIR you mentioned earlier projecting
from the sphere to a cylinder. The larger the polygon/triangle, the more
distorted the coordinates would be. This would also be true for the
distribution of the random points; it would no longer be uniform. So we'd
have to divide the problem into smaller triangles, perform a separate
projection for each of them, calculate in cartesian, and then transform
the result back. And it is still not clear what the accuracy of the
operation should be, i.e. the size of the triangles and therefore their
number (an accuracy-performance tradeoff). My (not well-informed) guess is
that this also depends on the use case, so we end up with the user deciding
about it anyway.

Adam

