# Geometry :

Subject: Re: [geometry] R-Tree Nearest Neighbors Unique Member From Group
Date: 2018-05-16 17:34:31

Hi,

Jpb Via GeometryÂ wrote:
> Hi Guys,
>
> I have an rtree where each item in the tree has a group ( Colors perhaps:
> red, green, blue, orange...)
> I want to do a nearest neighbor search where the search returns the nearest
> k-neighbors with at most
> one object from each group.
>
> For example, if we have
>
> Object A: Red, distance 1
> Object B: Red, distance 2
> Object C: Blue, distance 3
> Object D: Green, distance 4
>
> and I do a nearest neighbor at most one from each group search with k = 2
> then
> I get object A and C. B is skipped since A is closer of the same group.
>
> It looks like I could get this done by creating a specialization of
> distance_query_result class and have the distance_query_result::store method
> to check the groups/distances of already stored items and act accordingly.
> Then create a specialization of distance query and maybe query_dispatch to
> get distance_query to use the new distance_query_result.
>

No, there is no "official" (i.e. exposed in the interface) way of doing
it besides calling query N times for each group/category (using
nearest() && satisfies()) and after that merging the results. So your
approach is more optimal than that however it relies on the internals of
the R-tree so have in mind that they may be changed some time in the future.

I'd suggest keeping N stacks (neighbors) one for each group and track
them separately.
You also have to somehow pass the number of groups into the
distance_query_result.
Furthermore you have to implement
distance_query_result::greatest_comparable_distance() and return the
greatest distance from all groups but only if all heaps contain at least
one element each.

But I'm wondering, are the elements from all groups distributed more or
less in uniform fashion? I'm asking because otherwise the query may have
to traverse big chunks of the tree before finding at least one element
from each group. Consider the situation where there is a clear
separation between groups and groups are spatially far from each other.
In this case my guess is that it'd be better to have separate R-tree for
each group, perform knn query on all of them and merge the results.

An alternative to the solution above (for groups spatially far from each
other) would be to have one R-tree but to store in the nodes the
information about the groups of objects sotred in underlying nodes and
custom query visitor taking into account currently stored neaighbors and
groups covered by analysed nodes. So the idea is that the pruning of
nodes would be done both using the distance to the furthest neighbor
already found as well as groups of neighbors. This would not require to
merge the result at the end since the query woudl handle everything. But
it would require even more changes to the R-tree including introducing
new nodes (containing info about underlying groups), insertion visitor
(gathering the info about the groups from children and passing it to the
newly created node) and packing algorithm (if needed, the same as
insertion).

Regards,