Boost logo

Geometry :

Subject: Re: [geometry] Crash in rtree after growing shared memory, VS2012 x64, Boost 1.55
From: Adam Wulkiewicz (adam.wulkiewicz_at_[hidden])
Date: 2014-11-21 19:59:21


Hi Tim,

Tim Finer wrote:
> Hi Adam,
>
> It's been a few months, but I finally got back to trying out the bgl
> rtree with a resizable memory mapped file. Things are looking up and
> I have another question.
>
> I've edited our correspondence down to the essentials:
>
> On 7/26/14, 4:08 AM, Adam Wulkiewicz wrote:
>> Hi Tim,
>>
>> Tim Finer wrote:
>>>> On Jul 25, 2014, at 6:16 PM, Adam Wulkiewicz
>>>> <adam.wulkiewicz_at_[hidden]> wrote:
>>>>
>>>> In order to work the way you want the rtree requires an allocator
>>>> which will be able to increase the size of shared memory if needed.
>>>> I'm not sure if this is possible in general and/or with Interprocess.
>>>> I'll try to play with it in my free time.
>>>> If you were able to come up with something before me, please let me
>>>> know.
>>> Some other options I'm looking at are:
>>>
>>> 1 Scanning all the data first (it isn't in a convenient format, so
>>> I'll need to store it in another mapped file), then allocating
>>> enough space for the rtree.
>>
>> FYI, the packing algorithm done in the constructor (when you're
>> creating the rtree from e.g. a pair of iterators) should construct an
>> rtree having nodes "packed" as tightly as possible, you could
>> actually predict what'll be the actual number of nodes created. At
>> least for currently used algorithm. If you're curious see line 109 of
>> https://github.com/boostorg/geometry/blob/develop/include/boost/geometry/index/detail/rtree/pack_create.hpp.
>> So maybe it'd be a good idea to first load the objects to a vector,
>> calculate their size, then create a shared memory and construct an
>> rtree using pack-create.
>> Though I fear that if the memory could be fragmented this might not
>> work in all cases.
>>
> I want to avoid a first pass and would prefer to trade space for
> time. Is there a way that I could "actually predict what'll be the
> actual number of nodes created" given a live rtree? I'd like to do
> this based upon the currently populated rtree so that I could find the
> worst case scenario and allocate at least that much memory in a node
> insertion wrapper.
>

I think it could be possible. Since 1.56 or 1.57 (I don't remember) no
temporary containers should be created using passed allocator, only
nodes. In the worst case the rtree would have to create 1 leaf node + N
= depth new internal nodes per insert. Unfortunately there is no
official way of checking the depth. You should be able to do it like this:

bgi::detail::rtree::utilities::view<rtree_type>(rtree_obj).depth();

but it's non-official and could be removed/changed in the future.

>>
>>> 3. Monitor the capacity of the shared file and proactively
>>> resize given a threshold (say 90%). I just thought of this one and I
>>> like it because it sidesteps the entire exception problem
>>> altogether. The more I think about it, the more I like this one.
>>
>> Yes, this is also worth trying.
>> It seems that there is a get_free_memory() method in
>> managed_shared_memory so the amount of free space could be easily
>> returned.
>> In case if the memory could be fragmented you could try to allocate
>> some bigger block and see if it succeeds. Then resize the memory if
>> necessary.
>> When the exception is thrown get_free_memory() returns 968 in my test
>> and I'm starting from 1024 * 64.
>> The author of Boost.Interprocess could probably give us some pointers
>> about how big the threshold should be and if the fragmentation should
>> be taken into account.
>>
>
> Hey! This worked. I arbitrarily chose 90% and resized at that point
> and my initial tests work! What I'd like to do know is figure out how
> much memory would require. I'm not terribly worried about
> fragmentation (yet), but will get to that after I run some tests with
> finding appropriate thresholds.

So you probably would also need a size of nodes for the esstimate.
Unfortunately it depends on the node type and there is no official way
of obtaining it.

So again, everything described below is non-official and may be changed
in the future.

Currently (1.57) Boost.Variant-based nodes are used by default (this may
change in the future, in fact I plan to do it at soem point). Static
nodes stores varrays/static_vectors, dynamic nodes uses
boost::container::vector<>. So in the case of the container::vector<>
the ammount of memory would also depend on the allocation strategy of
this container. AFAIR this implementation of vector<> allocates only as
much as it needs but I'm not sure.

For Variant-based static nodes you should be able to check the size like
this:
sizeof(
bgi::detail::rtree::utilities::view<rtree_type>::allocators_type::node_allocator_type::value_type
)

For dynamic nodes it'd be more complicated since you should add the
container::vector<> storage which'd be in the worst case equal to
N*sizeof(Value) for leafs and around N*sizeof(std::pair<Box,
NodePointer>) for internal nodes. AFAIR in the worst case N should be
equal to MAX+1-MIN in the newly created nodes. Box would be of type
model::box<model::point<>> using the same coordinate type and dimension
as the Indexable. NodePointer would be a type returned by allocator
which you could get like this:
bgi::detail::rtree::utilities::view<rtree_type>::allocators_type::node_pointer

Because of all of this non-official stuff I'd consider it rather a
workaround than a solution.

Have you thought about the implementation of an allocator managing
arbitrary number of Boost.Interprocess allocators/shared files? It could
own some number of BI allocators and try to create a block of memory
using them and if it wasn't able to do it it could create another file.

Regards,
Adam


Geometry list run by mateusz at loskot.net