Boost logo

Geometry :

Subject: [geometry] [index] r-tree concerns
From: Adam Wulkiewicz (adam.wulkiewicz_at_[hidden])
Date: 2012-06-22 07:10:05


As you know (or not) we have two versions of the r-tree. Version 1 is in
Boost.Geometry repository and Version 2 is in the Sandbox:

Version 1 features:
- run-time polimorphic nodes with a lot of virtual functions, "classic"
creation algorithm.
- only one, linear creation algorithm,
- run-time MAX and MIN parameters,
- "classic" pair<Box, Data> value type handled.

Version 2 features:
- nodes are handled by visitor concept. I'won't be diving into details
but the key point is that the algorithm were rewritten and e.g. for 1M
points I achieved 20x speedup.
- multiple creation algorithms,
- compile-time MAX and MIN parameters,
- multiple, even user-defined value types handled.

I'd like to prepare this version to release but have a few concerns
about it:

1. Should MAX and MIN be compile-time? rtree.v2.01 took run-time MAX and
MIN. It was just a little slower but took more space than the current
one because static size nodes were unable to use (vectors takes more
space). But what are real-life use cases?

2. Both versions' queries returns values by copy. This means that if you
want to modify your values, you must store them somewhere else. When
e.g. The difference is that rtree.v2 may take various value types. If
e.g. Point is passed to the rtree.v2, the container behaves as std::set,
it returns copies of Point. If std::pair<Point, Data> is passed it
behaves as std::map. But the difference between the rtree and std
container is that they returns iterators not copies. What is more from
known reasons they make Keys const and returns iterator<const Point> and
iterator<pair<Point, Data>> respectively.

The main difference between std containers and rtree is that rtree's
iterators probably won't be preserved after the modification of the
rtree because internally values are stored in arrays/vectors. So, if the
interface is too close to the std containers users may assume that it
works exactly the same way.

My questions are:
Should rtree.v2 return some iterator<const Point> or
iterator<std::pair<const Point, Data>> respectively?
Maby there should be two query interfaces one returning copies, one
returning iterators?
What should be the interface of the rtree? Maby there shouldn't be only
one rtree<Value> but rtree_set<KeyValue> and rtree_map<Kay, Value>?

3. The reason why rtree takes rtree<Value> and dispatches it's type is
that it allows to use user-defined value types. It uses a Translator
object to get a Key from Value. The default one handles Points, Boxes,
std::pairs, pointers and iterators. The user may provide whatever type
he wants along with the translator. I now realized that this may be not
safe because he may modify the Key outside the rtree after the
insertion. Should this stay this way? or allow only predefined types
e.g. Point, Box, pair<Point, T>, pair<Box, T> and tuples?

Still, if someone wants to brake something he will do it. Even with std
containers one may use some e.g. wraped pointer as a Key and mess up
with it.

4. It is related to the previous point

Default translator uses SFINAE, which I believe won't work on all
compilers. Should there be:
- hard-coded types instead (less default functionality but someone may
still write his translator and to be honest those additional types
probably won't be used),
- some #ifdef BOOST_SFINAE_SUPPORTED (different interface for some
- as is (don't compilable on some compilers)?

So, that's it. Thanks for reading this not so short e-mail ;)


Geometry list run by mateusz at