Subject: Re: [geometry] R-Tree Nearest Neighbors Unique Member From Group
From: Adam Wulkiewicz (adam.wulkiewicz_at_[hidden])
Date: 2018-05-16 17:51:37
Adam WulkiewiczÂ wrote:
> Jpb Via GeometryÂ wrote:
>> Hi Guys,
>> I have an rtree where each item in the tree has a group ( Colors perhaps:
>> red, green, blue, orange...)
>> I want to do a nearest neighbor search where the search returns the nearest
>> k-neighbors with at most
>> one object from each group.
>> For example, if we have
>> Object A: Red, distance 1
>> Object B: Red, distance 2
>> Object C: Blue, distance 3
>> Object D: Green, distance 4
>> and I do a nearest neighbor at most one from each group search with k = 2
>> I get object A and C. B is skipped since A is closer of the same group.
>> It looks like I could get this done by creating a specialization of
>> distance_query_result class and have the distance_query_result::store method
>> to check the groups/distances of already stored items and act accordingly.
>> Then create a specialization of distance query and maybe query_dispatch to
>> get distance_query to use the new distance_query_result.
>> Is there a better way to go about this?
> No, there is no "official" (i.e. exposed in the interface) way of
> doing it besides calling query N times for each group/category (using
> nearest() && satisfies()) and after that merging the results. So your
> approach is more optimal than that however it relies on the internals
> of the R-tree so have in mind that they may be changed some time in
> the future.
> I'd suggest keeping N stacks (neighbors) one for each group and track
> them separately.
> You also have to somehow pass the number of groups into the
> Furthermore you have to implement
> distance_query_result::greatest_comparable_distance() and return the
> greatest distance from all groups but only if all heaps contain at
> least one element each.
> But I'm wondering, are the elements from all groups distributed more
> or less in uniform fashion? I'm asking because otherwise the query may
> have to traverse big chunks of the tree before finding at least one
> element from each group. Consider the situation where there is a clear
> separation between groups and groups are spatially far from each
> other. In this case my guess is that it'd be better to have separate
> R-tree for each group, perform knn query on all of them and merge the
> An alternative to the solution above (for groups spatially far from
> each other) would be to have one R-tree but to store in the nodes the
> information about the groups of objects sotred in underlying nodes and
> custom query visitor taking into account currently stored neaighbors
> and groups covered by analysed nodes. So the idea is that the pruning
> of nodes would be done both using the distance to the furthest
> neighbor already found as well as groups of neighbors. This would not
> require to merge the result at the end since the query woudl handle
> everything. But it would require even more changes to the R-tree
> including introducing new nodes (containing info about underlying
> groups), insertion visitor (gathering the info about the groups from
> children and passing it to the newly created node) and packing
> algorithm (if needed, the same as insertion).
I just realized that you want to get at most one element from each
group, not at least one element. So in this case you don't have to store
N stacks but only maybe some flags indicating elements of which groups
you already have.
However the lack of uniformity of distribution of various groups may
still affect the performance because effectively you want to have one
closest element from each group and among these pick k nearest ones. So
here again if the groups were far from each other you'd probably do
better if you had N R-trees one for each group and after N queries sort
the elements and pick k of them.
Keep in mind that all of this is only what my intuition tells me. If you
want to be sure you have to check it in practice.
Geometry list run by mateusz at loskot.net