Hi Adam,
 
Box regions aligned WRT a grid corresponding to the lowest level of the octree could be stored in the Rtree, this way the same effect could be achieved.

Ray queries are possible in the R-tree, they're even experimentally supported in the Boost.Geometry rtree. E.g. see: https://github.com/boostorg/geometry/blob/develop/index/example/benchmark_experimental.cpp but it should also be available in the release 1.55. They're called path queries because Segments and LineStrings can be passed as an argument.
This functionality isn't officially released yet because:
1. It gives correct results only in the cartesian CS
2. I'm not sure if this shouldn't be replaced by some more general user-defined nearest predicate

Btw, why is the ray query needed? Wouldn't a spatial query be enough to get the sample's corresponding voxel to update it?

The ray query is needed because data about both free and occupied regions is useful. One nice property of range sensors is that if you see an occlusion, like an obstacle, ground, walls, etc. you know that the ray between the sensor and the occlusion is free space as opposed to undefined/unknown space. This is critical for safety guarantees while navigating because regions that are known to be unoccupied are safe to enter. On the other hand regions that are occupied (depending on what's there) or for which there is no data are a gamble and don't carry the same guarantee of safety.
 
Or maybe the closest voxel must be updated because on the obstacles' edges samples distances may be noisy?

Yes, noise is also a big concern, such as variability in sampling location and value. It is possible to sample the same location twice, but due to noise, motion, etc. the two samples are not identical in terms of position, value, etc. This is where probabilistic robotics comes in to help account for this variation.  

This also brings up another important question about R-tree, what are the dimensional limitations of this implementation, can simple geometries such as a point, box, sphere or ellipse be handled in 3 dimensions, 4 dimensions, or even 100 dimensions?
 
I'm not sure how performance would compare, and I haven't tested it. It may vary a lot based on how sparse/dense the data is and the number of entries. I would hypothetically expect the octomap to have an advantage as density increases since with fixed voxels repeated entries at the same location would affect confidence levels rather than increasing the total amount of data stored. Perhaps there would also be a similar way to elide data with R-trees as well and thus mitigate that possible source of performance differences.

Yes, as I wrote above, the R-tree could be used to always store Boxes of a grid of the lowest level of the quadtree - voxels. Since the R-tree is an index and its interface doesn't allow the modification of the stored values, voxels geometrical data (Box) should be stored with some ID, and voxels data should be stored in some container outside the R-tree. Or some custom updating visitor could be used for this, but this is not officially supported.

I see, this is one of the interesting details I was hoping to find out. So essentially this should be used as an index for fast lookups, rather than a container for the real data.

In my possible applications one of the challenges is that everything is (soft) real time, so new data is constantly arriving and the geometries are not static. For instance, a car, person, or arm will often move between time steps.

What are the limitations here of a spatial index? 
  • Could it eventually be used to store mutable data, or would that pose a fundamental problem?
    • Perhaps there is more than one level at which data is "mutable" such as data/geometries that change location vs fixed data/geometries that change a stored value?
    • What are the implications of a custom updating visitor, given that it is unsupported?
      • Would it definitely break things or probably work?
      • Would it be extremely tricky or relatively straightforward?
      • Could a custom updating visitor update the locations of data or just the value?
    • Could shared pointers to objects be stored?
  • Would it make sense to use the index to access or cross reference dense data such as pixels of a depth image and points from a laser range sensor, or is it best for sparse data sets like querying higher level objects such as a "ground region" or "people on the ground" after processing raw data?
  • What would be the best way to index or track objects/data that move? 
    • Examples include an object representing a person walking around or joints moving on an arm.
    • Perhaps the index would need to be reconstructed, or objects removed and reinserted at each step to incorporate the new data?
    • Would one be more efficient than the other due to bulk loading?
I know the answer to some of these questions may vary dramatically from one specific problem/implementation to the next, but it is helpful to at least get an idea of what is known, unknown, and what is highly variable. 

Thanks a lot for your help and answering my questions!

Cheers!
Andrew Hundt