Subject: Re: [geometry] Crash in rtree after growing shared memory, VS2012 x64, Boost 1.55
From: Tim Finer (tim.finer_at_[hidden])
Date: 2014-07-26 00:08:21
Thank you for all the details and for all your work and also for a same day response!
> On Jul 25, 2014, at 6:16 PM, Adam Wulkiewicz <adam.wulkiewicz_at_[hidden]> wrote:
> Hi Tim,
> Tim Finer wrote:
>> I really want to use a persistent rtree backed by shared memory (or a memory mapped file). I don't know in advance how large the data. I experimented with resizing a boost::vector backed by a shared memory allocator and applied what I found there to the example at http://www.boost.org/doc/libs/1_55_0/libs/geometry/doc/html/geometry/spatial_indexes/rtree_examples/index_stored_in_shared_memory_using_boost_interprocess.html.
>> What I did was wrap an insertion function that catches a bad_alloc, closes the shared memory segment, grows the shared memory, then resets the pointer to the rtree. I've verified that this technique works fine for boost::vector, but not for the rtree, because it appears that there are some other structures internally that are possibly noting the (old) capacity of the shared memory?
> If an exception is thrown somewhere during the insertion the rtree is left in the inconsistent state, see the warning:
Wow, I should've read that first!
> In this case the allocation of the memory for additional node during the balancing procedure failed and the extra element was left in the node.
> The rtree only guarantees that there will be no memory leaks after exception is thrown.
> But yes, an rtree should probably be left in more consistent state (basic guarantee) and there should be no assert/segfault when someone wanted to modify the rtree after the exception was thrown.
> The copying won't work because some parts of the rtree may dissapear. The rtree doesn't guarantee strong exception safety.
> [Extended description]
> The support for strong guarantee in the insertion/removal/reinsertion would require to prepare all changes beforehand and then to apply them all at once. E.g. allocate all required nodes, fill them with data, somehow store this changeset and then when everything was ok, only replace the pointers. In the meanwhile all changes should probably be tracked and it should be possible to work with the the "temporary" structure transparently. I'm guessing that it would introduce some performance penalty and that the no-leak or basic exception safety is a "natural" exception safety level for an rtree.
Sounds like this might be another policy to bind, since it is expensive. Having a strong guarantee would be especially useful for a disk based data structure.
> Why I think that this is how strong exception guarantee should be implemented?
> It's because the balancing procedure may be quite complicated e.g. if the value is removed from the R*-tree and there is an underflow in a node detected, this node is removed and remaining elements are inserted into the rtree again. If during such insert an overflow is detected the forced reinsertion procedure may kick in and some number of elements will be removed from overflowed node and reinserted again. If during such forced reinsertion again an overflow occurs the new node is created and splitting procedure is applied. And this is done for each inserted element or a child node at every level of the rtree if necessary. So if at some Point and exception is thrown in the allocator or Value's copy ctor or whatever, we're somewhere in the middle of the procedure described above and without tracking what happened before we can't do anything.
Wow, this is a tough problem. I've typed a few thoughts out, only to delete them moments later. You almost need a way to store the entire state of the tree first (I know that's not feasible).
>> I also noted in a previous thread a similar problem, and applied the fix on github, but that didn't help (http://boost-geometry.203548.n3.nabble.com/rtree-crash-when-used-with-inter-process-td4026037.html).
> Yes, because this was related to the way how nodes are stored in the memory and in your case the exceptions are the problem.
> Btw, in the example, shouldn't the new size be = mem->get_size() * 1.5 ?
You'd think it would be but the argument to grow is in "additional bytes".
>> I'm not married to the way I'm doing this, and welcome any suggestions or workarounds. My other thought is copying the entire tree into a new (larger) shared memory segment, but I'm sure that'll be a huge performance hit.
> In order to work the way you want the rtree requires an allocator which will be able to increase the size of shared memory if needed.
> I'm not sure if this is possible in general and/or with Interprocess.
> I'll try to play with it in my free time.
> If you were able to come up with something before me, please let me know.
Some other options I'm looking at are:
1 Scanning all the data first (it isn't in a convenient format, so I'll need to store it in another mapped file), then allocating enough space for the rtree.
2. Read up on allocators and trap the exception down at a lower level, and possibly do the resize there. There are no guarantees about the newly allocated pointer's location after growing.
3. Monitor the capacity of the shared file and proactively resize given a threshold (say 90%). I just thought of this one and I like it because it sidesteps the entire exception problem altogether. The more I think about it, the more I like this one.
I'll let you know what I find.
> Geometry mailing list
Geometry list run by mateusz at loskot.net