Boost logo

Geometry :

Subject: Re: [geometry] Index distance predicates names
From: Adam Wulkiewicz (adam.wulkiewicz_at_[hidden])
Date: 2012-08-28 05:51:40


Barend Gehrels wrote:
>>
>> R-tree allows to store volumetric values (boxes) and find some number
>> of them nearest to some point. It is possible to define which point of
>> volumetric object is taken in distance calculation. It may be the
>> nearest point, the centroid or the furthest point. This corresponds to
>> functions bgi::near(), bgi::centroid() and bgi::far().
>
> Nice functionality, yes, this makes sense if you measure two polygons to
> a reference point (how does it work? calculate the furthest point on a
> polygon from a reference point, is that also indexed? done on the fly?).
>
> I find the name "centroid" not so intuitive in this context. You mean
> here: "with_respect_to_its_centroid", or "relative_to_its_centroid".
> Maybe "centroid_relative"? or "furthest_point_measured"? or simply "to",
> so "to_nearest"?
>

One remark, since r-tree operates on boxes it is furthest point on a box
to a query point.

>
>>
>> bgi stands for boost::geometry::index namespace.
>>
>> To complicate things. There is also a method rtree.nearest(Predicate)
>> and function bgi::nearest(rtree, Predicate).
>
> Are they both meant to be called by the library user? Within geometry,
> we normally use no methods on objects to be able to adapt other things.
> Maybe that rationale is not applicable to rtree, but are these methods
> synonyms?
>
>> So knn query may look like this:
>>
>> tree.nearest(bgi::near(pt), output);
>> bgi::nearest(tree, bgi::near(pt), output);
>> tree | bgi::filters::nearest_filtered(bgi::near(pt))
>>
>> or this:
>>
>> bgi::nearest(
>> tree,
>> bgi::bounded(
>> bgi::near(pt),
>> bgi::centroid(10),
>> bgi::far(20)
>> ), output);
>
> I'm failing to see what this exactly means. What is 10 doing in
> centroid, and 20 in far. Are those max. distances to the mentioned
> point? Or the 10 occurrences w.r.t. their centroids? I must admit I did
> not look up the docs.
>
> If we have a suffix, e.g. "_relative", the user is probably less
> confused about its meaning.

In addition to calculating distance from query point to a box in various
ways this function allows to return objects that are further or closer
than some distance in other words they are in a ring. Those bounding
distances may be calculated differently than the distance to the point.
The above example is the most complicated nearest query that user can
use. The query would return nearest Value which distance to the closest
point is the smallest but the distance from point to Value's centroid is
bigger than 10 and distance from point to Value's furthest point is
lesser than 20.

I see that this interface confuses you. Probably because it is something
you'll not see often in the rtree. ;)

Btw, the functions defining the ring have names: unbounded(),
min_bounded(), max_bounded() and bounded(). Their names refer to the
calculated distances.

Value passed to near(), centroid() and far() is query point in one case
and distance in other case. Maby we should distinguish between those
cases? Something like examples below.

bgi::nearest(tree,
   bgi::bounded(
     bgi::nearest_relative_to(pt),
     bgi::distance_to_centroid(10),
     bgi::distance_to_furthest(20)
   ),
   k, output
)

or

bgi::nearest(tree,
   bgi::bounded(
     bgi::smallest_distance_to(pt),
     bgi::distance_to_centroid(10),
     bgi::distance_to_furthest(20)
   ),
   k, output
)

or

bgi::nearest(tree,
   bgi::bounded(
     bgi::smallest_distance_to(pt),
     bgi::distance_to_centroid(10),
     bgi::longest_distance(20)
   ),
   k, output
)

or something else. I'd prefer something shorter ;)

>>
>> Some names may form non-intuitive query, e.g.:
>>
>> bgi::nearest(tree, bgi::nearest(pt), output);
>
> So this means: the closest geometries to "pt", where from each geometry
> the point closest to "pt" is measured.
>

This is how the query would look like if nearest() name was used instead
of near(). Now it looks like this:

bgi::nearest(tree, bgi::near(pt), output);

And yes, the closest box's point whould be taken into account.

>>
>> What is more, I'd rather avoid furthest() because it is nice antonym
>> of nearest() and may be used as similar function name.
>>
>> To the point. Possible names are:
>>
>> near -> nearest, close, closest
>> far -> furthest, distant, most_distant
>>
>> Do you have any preferences or other ideas?
>
> Nearest is commonly used in "nearest neighbor"
> Oracle uses "sdo_nn" but without sdo this is probably too much an
> abbrevation
>
> I think the most distant points are normally termed "farthest", see e.g.
> http://dl.acm.org/citation.cfm?id=142740
>
> but furthest is also widely used.

Ok, so I think we shouldn't use those names to define distances.

Regards,
Adam


Geometry list run by mateusz at loskot.net