Boost logo

Geometry :

Subject: Re: [geometry] R-tree vs octomap
From: Adam Wulkiewicz (adam.wulkiewicz_at_[hidden])
Date: 2014-05-27 18:22:33


Hi,

Andrew Hundt wrote:
> This also brings up another important question about R-tree, what are
> the dimensional limitations of this implementation, can simple
> geometries such as a point, box, sphere or ellipse be handled in 3
> dimensions, 4 dimensions, or even 100 dimensions?

Yes, as far as your compiler can compile such deep recursive tamplates.

> I'm not sure how performance would compare, and I haven't
> tested it. It may vary a lot based on how sparse/dense the
> data is and the number of entries. I would hypothetically
> expect the octomap to have an advantage as density increases
> since with fixed voxels repeated entries at the same location
> would affect confidence levels rather than increasing the
> total amount of data stored. Perhaps there would also be a
> similar way to elide data with R-trees as well and thus
> mitigate that possible source of performance differences.
>
>
> Yes, as I wrote above, the R-tree could be used to always store
> Boxes of a grid of the lowest level of the quadtree - voxels.
> Since the R-tree is an index and its interface doesn't allow the
> modification of the stored values, voxels geometrical data (Box)
> should be stored with some ID, and voxels data should be stored in
> some container outside the R-tree. Or some custom updating visitor
> could be used for this, but this is not officially supported.
>
>
> I see, this is one of the interesting details I was hoping to find
> out. So essentially this should be used as an index for fast lookups,
> rather than a container for the real data.
>
> In my possible applications one of the challenges is that everything
> is (soft) real time, so new data is constantly arriving and the
> geometries are not static. For instance, a car, person, or arm will
> often move between time steps.
>

AFAIU this is different use-case than the one mentioned above, where
samples are stored?

> What are the limitations here of a spatial index?
>
> * Could it eventually be used to store mutable data, or would that
> pose a fundamental problem?
>

It could but is not supported. Otherwise the user could modify the part
of the stored Value keeping the geometrical data which would brake the
index. This is similar thing to the std::set<> storing const Value and
std::map<> storing const Key. And it's a spatial index, not container.

> o Perhaps there is more than one level at which data is
> "mutable" such as data/geometries that change location vs
> fixed data/geometries that change a stored value?
>

Initially I thought about implementing a Container or two Containers
with the interface similar to std::set<> and std::map<> but decided to
go into the direction of the Index capable to store arbitrary types.

We could of course return to the idea of Containers and think about what
should be done in order to implement some based on the R-tree code.
Should they be separate templates or could the rtree be modified? E.g.
what if the rtree for passed Value = std::pair<Box, Id> always store
std::pair<Box const, Id>. Would it brake users' code? Probably yes.
Should we rather implement geometry::container::rtree_map<Indexable,
Data> and always store std::pair<Indexable const, Data>? Etc.

> o What are the implications of a custom updating visitor, given
> that it is unsupported?
> + Would it definitely break things or probably work?
>

Custom, user-defined visitors would definietly work but might stop
compiling with some future release of Boost.Geometry if something were
changed internally.

> + Would it be extremely tricky or relatively straightforward?
>

It depends what you'd like to do. The visitors are quite simple to
write, they're similar to Boost.Variant static visitors. E.g. take a
look at
https://github.com/boostorg/geometry/blob/develop/include/boost/geometry/index/detail/rtree/utilities/print.hpp
line 132. It's a visitor recursively printing the contents of the R-tree.

> + Could a custom updating visitor update the locations of
> data or just the value?
>

Definietly it could not just update the stored value. It might naiively
expand all of the ancestor nodes boxes but this would result in great
overlap of the nodes. So instead this moving_visitor should search for
the old value top-down remove it from the node (handle underflow
locally), search for the place to insert the new one somewhere near the
old node bottom-up and down again (handle overflow locally) and then
handle underflow and overflow globally bottom-up. To implement it the
inserting and removing visitors of all balancing algorithms should
probably be searched for parts which could be reused, parts of the
algorithms run per level. This in fact could be a nice addon to the
R-tree and could be a part of the official interface.

It's very probable that in most of the cases the Value would be moved
inside the same node or somewhere very close. But still we'd be forced
to search for the old value which would be O(logN), however in the case
of the R-tree it could be quite ok since the R-tree might have small
number of levels. E.g. for Max = 16 to represent 1M Values one'd need 5
levels.

It would be possible to implement a container similar to std::map<>
providing non-invalidated Iterators to Values. They could be passed to
the Rtree to indicate a position of the removed Value (this would be
O(1)). And they had to be not invalidated by the remove() because
otherwise it would be not possible to remove more Values than one. To do
this we'd be forced to add some pointers to parents to the R-tree nodes
and to allocate the memory for each ValueNode separately but this could
be done. On the other hand, going bottom-up we'd be forced to search the
parent node for the child we came from (O(Max)). So we'd get the
non-invalidated Iterators and (probably) faster remove but the queries
would (probably) be slower.

The above are only my predictions since all of those things should
really be implemented and the performance measured to be sure.

> o Could shared pointers to objects be stored?
>

Yes, but:
1. it could be not safe since one could modify the object pointed by the
shared_ptr (and modify the Indexable data or in other words a Key),
2. it would probably be slower than storing the Geometry directly in the
R-tree because dereferencing shared_ptrs<> would ruin cacheing of Values.
So I'd advice storing std::ptr<Box, ID> or std::ptr<Box, shared_ptr<>>,
but one can use whatever he like.

> * Would it make sense to use the index to access or cross reference
> dense data such as pixels of a depth image and points from a laser
> range sensor, or is it best for sparse data sets like querying
> higher level objects such as a "ground region" or "people on the
> ground" after processing raw data?
>

It would make sense in both situations to use some kind of spatial
indexing or partitioning technique. In some use-cases the R-tree could
be better than in the others.

> * What would be the best way to index or track objects/data that move?
> o Examples include an object representing a person walking
> around or joints moving on an arm.
> o Perhaps the index would need to be reconstructed, or objects
> removed and reinserted at each step to incorporate the new data?
>

Yes, currently the old value should be removed and the new one inserted.

> o Would one be more efficient than the other due to bulk loading?
>

Bulk loading would definietly be faster than inserting each value
separately. I tested its speed and for Points it's as fast as the
kd-tree created in the contigeous memory using a recursive algorithm.
The difference in the time of rebuilding the whole tree and reinserting
values would depend on the number of values. If it were some big number
it would probably be faster to create the container from scratch.

> I know the answer to some of these questions may vary dramatically
> from one specific problem/implementation to the next, but it is
> helpful to at least get an idea of what is known, unknown, and what is
> highly variable.

A reasonable approach would be to pick some container/library which is
fairy simple to use and then check the performance, profile to find
bootlenecks and if they were caused by the container, test some other
one or think about an other algorithm.

Regards,
Adam



Geometry list run by mateusz at loskot.net