Boost logo

Geometry :

Subject: Re: [geometry] Are all query iterators const?
From: Adam Wulkiewicz (adam.wulkiewicz_at_[hidden])
Date: 2015-09-09 10:07:31


Hi Hal,

Hal Clark wrote:
<snip>
> Thanks for the clear, informative response Adam. I'm primarily
> interested in attaching arbitrary data to the point class in order to
> also cluster non-spatial attributes. For example, using GDBSCAN
> (http://link.springer.com/article/10.1023%2FA%3A1009745219419). The
> crux of my issue is that the R*-tree may or may not need to index the
> non-spatial attributes in addition to spatial coordinates. I'm still
> toying with the idea of making these de-facto spatial ("virtual" or
> "phony") coordinates, but some clustering algorithms require a clear
> distinction. Also, defining a custom distance metric seems a bit
> involved.
>
> Your suggestion to make a mapping from some token to external data is
> probably the most reliable means of including attributes, but would
> result in lots of cache misses and make it hard to index attributes.
> At the moment I'm simply giving the user a template to decide whether
> they want to pay for fat attribute members.

If non-spatial data is used in the searching process then of course it
could be better to store it inside the Point. Otherwise it'd probably be
better to get rid of the cold data. It's hard to tell which would be
better without profiling. And I'm guessing the performance may depend on
the actual types of points so leaving the decision to the user may be
the best solution.

> You mentioned std::multiset which means duplicates are accepted and
> indexed properly. This is great! Most R*-trees I've seen do not do
> this. Is this a property that is guaranteed into the future? Other
> than a post you made in 2014 for Boost 1.55
> (http://stackoverflow.com/questions/22920055/boostgeometryrtreecontains-return-multiple-identical-results),
> I do not see any claims in the documentation.

Yes, this is how the R-tree works right now and breaking changes
wouldn't be acceptible in this case. If a std::set-like-rtree was needed
we should probably implement a separate index.

>
> Also, not that my implementation is suitable at the moment, but are
> you accepting pull requests for clustering algorithms? I see a
> suspicious lack of them...

Sure, any help is appreciated. Indeed there are many algorithms which
could be good additions to the library. So don't hesitate to propose
anything. It was more or less like this with the rtree.

A'propos clustering algorithms, some time ago I planned to add a
balancing algorithm based on k-means but I haven't found enough time:
- Revisiting R-tree Construction Principles
(http://pyrtree.googlecode.com/hg-history/f81b9c8157117a1f97cdcc3341ae9a061f5df4fa/doc/ref/r-tree-clustering-split-algo.pdf)

Regards,
Adam


Geometry list run by mateusz at loskot.net