Subject: Re: [boost] Geometry and spatial indexes, my opinion
From: Patrick Mihelich (patrick.mihelich_at_[hidden])
Date: 2008-10-09 07:23:01
On Thu, Oct 9, 2008 at 1:48 AM, Bruno Lalande <bruno.lalande_at_[hidden]>wrote:
> Stop me if I'm wrong but what you're discussing now is how to
> implement the point class provided by an ideal geometry library,
> right? If so, I think it's not the most important concern since, as
> I've already said, the primary goal is not to provide a point class
> but to define how points are accessed and defined in general. I think
> that, by talking about tuples, Brandon was rather pointing out that
> points should be accessible the same way as tuples: get<N>(p). Which
> is quite a general consensus now, IMO.
OK, perhaps I misinterpreted Brandon's comment. I have been getting caught
up on the March discussions, so apologies if I rehash old debates. If he was
simply advocating compile-time indexing via get<N>(p) or similar syntax,
then I completely agree. This should be required by the Point concept (or
Coordinate concept? I do not quite understand the distinction).
On the other hand, I'm worried to see that the geometry::point
implementation (at least the version included with SpatialIndexes) supports
only compile-time indexing. I think we must also have a RuntimeIndexable
concept, to which a particular point type may or may not conform.
RuntimeIndexable would require an indexing operator, so it can only be
modeled by homogeneous points. Iterators via begin() and end() could be
useful as well. I have a couple of reasons for wanting this, as usual
motivated by high dimensional spaces...
First, some algorithms require runtime indexing. kd-trees are an interesting
example. When constructing a kd-tree over a data set, we need some heuristic
to decide which dimension to split on at each tree node. In 2 or 3
dimensions, usually you simply cycle through the dimensions in order, for
which in principle only compile-time indexing is necessary. But this
heuristic is not so appropriate in high dimensions; then it is usual to
split each tree node on the dimension with the greatest variance in the
data, which of course is unknown at compile time.
Second, there are efficiency and code size concerns. Consider the Euclidean
distance function. For 2D/3D points, it is nice to use compile-time indexing
and metaprogramming to guarantee tight, loop-less code. For high dimensional
points, however, beyond some dimension
(BOOST_GEOMETRY_MAX_UNROLL_DIMENSION?) unrolling the loop is no longer
desirable, and we would rather just do the runtime iteration. If the points
are RuntimeIndexable, we can choose an appropriate distance algorithm at
compile time depending on the points' dimensionality.
I do think you are on the right track, and I look forward to seeing preview
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk