Boost logo

Boost :

Subject: Re: [boost] [gsoc19] Seeking a mentor for Boost.Geometry project related to triangulation and random sampling
From: Tinko Bartels (tinkobartels_at_[hidden])
Date: 2019-03-19 11:07:35


Hi Adam,

thank your for your extensive feedback. I'll try to address all questions.

> One projects can be a lot of work and you propose to work on 2 at the
> same time. Are you certain you'd like to do it? Or is it maybe
> because both projects are somehow related, i.e. one is needed by the
> other?

There is a relation in the sense that I think that a fast (not necessarily
Delaunay) triangulation algorithm and a triangle concept are beneficial for
a random point generator that generates points uniformly distributed inside
a polygon (and this is probably true for higher dimensions as well). For a
large number of points, I found that directly sampling to triangles
outperforms a rejections sampler.

Other than that, my proposal to combine two projects is based on a
conservative estimate of the total amount of work. It took roughly a day to
implement an initial, imperfect variant of s-hull (for the programming
competency test) and the triangulation algorithms commonly described in
literature seem to be intuitively accessible enough to be implemented
properly in a reasonable timeframe. The random samplers are, in my opinion,
a simpler problem than the triangulation, so they can be completed afterward
but still within the GSoC. I am certain that I can produce a good quality
implementation with proper tests and documentation of the proposed features
in the GSoC work period with reasonable effort.

> What coordinate systems are you planning to support? Cartesian,
> geographic, etc. In case you decided to implement algorithm for one
> coordinate system, would the same algortihm theoretically work with
> different ones or would different algorithm be required?

Regarding the random generators: Generally, uniform sampling can be done in
geographic coordinates using an equal-area transformation from the cylinder
to the sphere, and I have considered that in the proposal (last paragraph
before milestones).

However, I imagine that in GIS applications one is usually interested in
very small subsets of the sphere so rejection_sampling would be
prohibitively expensive. In the programming competency test, I implemented a
more efficient sampler for convex polygons by decomposing it into triangles
and sample uniformly on triangles.

I suppose the most obvious approaches would be to do the same thing for
polygons on spheres (I've found a straightforward way to do that in
literature: https://www.graphics.cornell.edu/pubs/1995/Arv95c.pdf ) or to at
least chose some box or polygon in cartesian coordinates so that its image
(under an equal area transformation) covers the interesting area of the
sphere and not too much more, take uniform samples in it, map them to the
sphere, and reject them if necessary.

I'd be thankful for input from potential users of randomly generated points
in geographic (or other) coordinate systems if there is more to consider.
More on randomly generated points below.

Regarding triangulation: In my literature research for the proposal, I
concentrated on Delaunay triangulations in cartesian coordinates. I've
looked into the spherical case and there is a lot of literature for that as
well, for example, this paper on adapting Fortune's algorithm to spheres (
http://www.e-lc.org/tmp/Xiaoyu__Zheng_2011_12_05_14_35_11.pdf ). I've seen
very intuitive approaches to that using complex hull algorithms and
projections to the sphere. I don't know how well and with how much effort
the algorithm I've linked in my proposal translates to triangulations on
spheres but I will think about it.

Do you know how much interest there would be in Delaunay triangulations on
spheres and what the usual use case would be? If someone enters 3 points
does he want a decomposition of the sphere into 4 triangles or would one
usually be interested in the triangulation of a small subset of the sphere?

> 2. Triangulation
>
> So triangulation is basically a triangle mesh represented by arrays
> of vertices and indexes, correct? If that's the case then maybe we
> need more general mesh concept?

I suppose it could be generalized. Edges and vertices would remain the same
thing but faces could be arbitrary rings with more than three adjacent
faces.

> If the result of triangulation is a mesh then why do you need
> triangle concept as 3-point ring? Wouldn't a view/proxy used
> internally in your algorithm and storing iterators/pointers to
> vertices in the mesh be enough?

I imagine triangles are useful primitives in general and it would be
beneficial to have the concept to specialize some existing algorithms for
them (like area) so that their implementation can make use of the additional
constraint of the triangle being a ring with only three points. I agree that
internally in the algorithms for Delaunay triangulation, triangles would be
represented by views with iterators to vertices but I thought that would not
make a difference because those views and triangles that actually store
vertices could both fulfill the same concept.

> You mentioned Boost.Graph. Would your algorithm depend on this
> library? If possible we should avoid dependencies. If it doesn't
> depend then we could still have concepts compatible with other
> libraries so the same types can be easily adapted to work with both.

I would not depend on it because the algorithm linked in the proposal (as
far as I can see) does not use any complicated graph operations. It would be
nice to be compatible with BGL because I've already had cases in which I had
to perform more complicated graph operations on meshes.

> left-of-predicate in Boost.Geometry is called "side strategy" and is
> implemented for all supported coordinate systems.

I completely overlooked that one. I see that it is designed to optionally
supports higher precision coordinate types. I will look further into that.

> in-circle-predicate however is not. It also looks like more complex
> operation since you have to find a center of the circle and then
> check if a point is inside it. In Boost.Geometry this would be
> divided into centroid() and distance(). So my guess is that the in-
> circle-predicate should be divided into more basic building blocks.
> However I'm not sure if the center of the circle is the same as
> barycenter in non-cartesian coordinate systems. Furthermore
> Boost.Geometry doesn't have non-cartesian centroid yet. Even if you
> do not support geographic triangulation right now we should do
> theoretical work and prepare the interface for future extensions,
> i.e. know what basic coordinate-system-dependant operations will be
> needed in the future.

I implemented it in that two-stage way in my implementation of Delaunay
Triangulation for the programming competency test. However, I found that
this seems to be usually done in a single step (see
https://www.cs.cmu.edu/~quake/robust.html ), I guess for performance and
stability reasons. I noticed that with larger problem sizes (1'000'000
points and more) my Delaunay triangulation would not terminate for some runs
(I generate the problems randomly) and some experiments (I did not look too
deeply into it) suggest that my two-staged, non-robust in-circle-predicate
is to blame here.

I would trust the literature there and implement a simple approximate
predicate that works for non-degenerate problems and an adaptive predicate
that will be a little slower but still work for degenerate cases.

> In case you wanted to work on centroid too, geographic centroid could
> be yet another whole project for GSoC, we were actually considering
> to add it. ;)

I am afraid that would lead to overstretching my proposal. ;)

> 3. Randomization
>
> Just an idea about the generalization of the interface. I can imagine
> that someone may want to generate points in the interior of a polygon
> and someone else may want points in the interior or on the boundary,
> etc. With current design you'd have to add more function names for
> various conbinations, correct?

I would not create special generators for all conceivable cases. But I think
handling some special cases is very benefecial. The one I produced for the
programming competency test for project 4 uses a trivial fan triangulation
and uses that to produce random samples directly rather than sampling from
the bounding box and rejecting based on the within-predicate.

I did not commit it but I did write a simple rejection sampler as well and
found it to be much slower (I think about 2-3 times), even though the convex
polygons are comparatively simple geometries to check within for and they
cover a large portion of their bounding box so only a good portion of
samples would not be rejected.

> AFAIU the rejection_generator would be a special case of a generator
> capable to take spatial predicates defining returned points. In this
> case returning points that are let's say within(A) && ! covered_by(B)
> (so in the interior of A and in the exterior of B). So another
> possibility is to have a generator taking general spatial predicates.
> Furthermore, the above (points in interior/boundary/exterior) could
> also be achieved with this approach.

I totally agree with the suggestion that arbitrary spatial predicates should
be supported for rejection sampling.

> Note also that performing randomization for arbitrary geometries
> would be supported with this interface, also e.g. N random points
> within multipoint.

I agree but the probability of a uniformly distributed point lying inside a
multipoint (except for a very coarse space) is vanishingly small. I think
that point generation should at least be specialized by the type of domain
(discrete, lines, volumes).

> It is also possible to decide to go this way but do not implement all
> elements right now. In such case we could think about the interface
> and be prepared for extensions in the future. E.g. for now support
> only something like this:
>
> namespace bg = boost::geometry;
> bg::rng::generate(bg::rng::within(A), output);

I like that interface. I would maybe drop the rng, because it's conceivable
to have generators in the future for deterministic point generation as well,
for example to create some regular lattice or nodes on edges.

Kind regards,
Tinko Bartels

--
Sent from: http://boost.2283326.n4.nabble.com/Boost-Dev-f2600599.html

Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk