From: Ion Gaztañaga (igaztanaga_at_[hidden])
Date: 2005-10-01 18:09:20


Hi to all,

  I've missed the thread, but since I've seen Shmem mentioned I
couldn't resist...

> The one killer limitation of shmem (that I'm pretty sure Ion is working hard
> to remove) is that the shared memory region cannot be grown once it has been
> created. This is where your memory-mapped "persist" library has a leg up.

The problem is quite hard to solve if you allow shared memory to be
placed at different base addresses in different processes. Performance
would also suffer if, on every pointer access, I had to check whether
the memory segment it points to is already mapped. To identify each
segment, a pointer would have to hold the name of the segment and an
offset, and each access would imply discovering the real address of
that segment in the process accessing the pointer. That is really a
hard task, and performance would suffer a lot. I think previous efforts
with growing shared memory ("A C++ Pooled, Shared Memory Allocator For
The Standard Template Library", http://allocator.sourceforge.net/) use
fixed memory mappings in different processes. But this is an issue I
would like to solve after the first version of Shmem is presented for
review (I plan to do this shortly, within two months).
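
To make the cost concrete, here is a minimal sketch of such a
"segment name + offset" pointer. It is not Shmem code: SegmentTable,
SegmentPtr and resolve() are hypothetical names made up for this
illustration, and a real version could not keep a std::string inside
shared memory (it would need a fixed-size name or a numeric id). The
point is only that every dereference pays for a table lookup.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical per-process table: segment name -> base address at which
// THIS process has mapped that segment.
class SegmentTable {
public:
    void register_mapping(const std::string& name, void* base) {
        assert(base != nullptr);
        bases_[name] = static_cast<char*>(base);
    }

    // Looks up the local base address; throws if the segment is not mapped.
    char* base_of(const std::string& name) const {
        std::map<std::string, char*>::const_iterator it = bases_.find(name);
        if (it == bases_.end())
            throw std::runtime_error("segment not mapped: " + name);
        return it->second;
    }

private:
    std::map<std::string, char*> bases_;
};

// Hypothetical "fat" pointer: identifies an object by segment and offset,
// so it stays meaningful whatever base address each process uses.
struct SegmentPtr {
    std::string segment;
    std::size_t offset;

    // Every access goes through the table: this is the per-dereference cost.
    template <class T>
    T* resolve(const SegmentTable& table) const {
        return reinterpret_cast<T*>(table.base_of(segment) + offset);
    }
};
```

Compared with a raw pointer, each resolve() performs a map lookup plus
an addition, which is why I expect this approach to hurt performance.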

Memory mapped files are another matter. Disk blocks can be dispersed
across the disk, but the OS will give you the illusion that all data is
contiguous. Currently in Shmem, when using memory mapped files as the
memory backend, if your memory mapped file is full, you can grow the
file and remap it, so you have more space to work with. An in-memory DB
can be easily implemented using this technique: when an insertion into
any object allocated in the memory mapped file throws
boost::shmem::bad_alloc, you just call:

named_mfile_object->grow(1000000/*additional bytes*/);

and the file grows and you can continue allocating objects. Take care,
because the OS might have changed the mapping address. In Shmem you can
obtain offsets to objects and use them to recover the new address of an
object after remapping. You can use the same technique with heap
memory. The trick in Shmem is that, to achieve maximum performance, the
memory space must be contiguous. For growing memory and persistent
data, memory mapped files are available in Shmem. Maybe that is not
enough for a relational DB, but I would be happy to work with RTL on
this.
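
The grow-and-retry idea, together with keeping offsets instead of
addresses, can be sketched with plain heap memory. This is only an
illustration under my own assumptions, not the Shmem interface:
GrowableArena, allocate_or_grow() and address_of() are hypothetical
names, a std::vector stands in for the mapped file, and alignment is
ignored to keep it short.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical contiguous arena backed by heap memory. Allocation hands
// out offsets from the base, never raw addresses, because growing the
// arena may move the whole block (just as remapping a file may).
class GrowableArena {
public:
    explicit GrowableArena(std::size_t bytes) : storage_(bytes), used_(0) {}

    // Simple bump allocation; throws std::bad_alloc when the arena is full.
    std::size_t allocate(std::size_t bytes) {
        if (bytes > storage_.size() - used_) throw std::bad_alloc();
        std::size_t offset = used_;
        used_ += bytes;
        return offset;
    }

    // Enlarges the arena; existing contents are preserved, but the base
    // address may change.
    void grow(std::size_t additional) {
        storage_.resize(storage_.size() + additional);
    }

    // Recovers the current address of an object from its offset.
    char* address_of(std::size_t offset) {
        assert(offset < storage_.size());
        return storage_.data() + offset;
    }

    std::size_t size() const { return storage_.size(); }

private:
    std::vector<char> storage_;
    std::size_t used_;
};

// On exhaustion, grow once (by at least the requested size) and retry.
inline std::size_t allocate_or_grow(GrowableArena& arena,
                                    std::size_t bytes,
                                    std::size_t growth) {
    try {
        return arena.allocate(bytes);
    } catch (const std::bad_alloc&) {
        arena.grow(growth > bytes ? growth : bytes);
        return arena.allocate(bytes);
    }
}
```

Any address obtained from address_of() before a grow() must be thrown
away and recomputed from the offset afterwards.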

I've downloaded RTL and I've seen that the "mt_tree" class uses raw
pointers in the red-black tree algorithm. If you use memory mapped
files and you store raw pointers there, the file is unusable unless you
map it again at exactly the same address where you created it. All data
in the memory mapped file must be base-address independent. That's why
Shmem uses offset_ptr-s and containers that accept this kind of
pointer.
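
For readers who have not seen the idea, here is a minimal sketch of a
self-relative pointer and a linked list built with it. It is a
simplified illustration, not Shmem's offset_ptr: OffsetPtr, Node,
build_list() and sum_list() are hypothetical names, and using offset 0
as the null value means a pointer cannot point to itself. Copying the
buffer byte by byte stands in for mapping the same file at a different
address.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <new>

// Stores the distance from the pointer object itself to the target, so
// the value stays valid when pointer and target move together.
template <class T>
class OffsetPtr {
public:
    OffsetPtr() : offset_(0) {}
    OffsetPtr(T* p) { set(p); }
    // Copies must recompute the offset relative to the new location.
    OffsetPtr(const OffsetPtr& other) { set(other.get()); }
    OffsetPtr& operator=(const OffsetPtr& other) { set(other.get()); return *this; }
    OffsetPtr& operator=(T* p) { set(p); return *this; }

    T* get() const {
        if (offset_ == 0) return nullptr;  // 0 is reserved for "null"
        return reinterpret_cast<T*>(
            reinterpret_cast<std::intptr_t>(this) + offset_);
    }
    T& operator*() const { return *get(); }
    T* operator->() const { return get(); }

private:
    void set(T* p) {
        offset_ = p ? reinterpret_cast<std::intptr_t>(p) -
                      reinterpret_cast<std::intptr_t>(this)
                    : 0;
    }
    std::intptr_t offset_;
};

struct Node {
    int value;
    OffsetPtr<Node> next;
};

// Builds a singly linked list 1, 2, ..., count at the start of `base`.
// The buffer must be aligned for Node and hold at least `count` nodes.
inline void build_list(void* base, int count) {
    Node* nodes = static_cast<Node*>(base);
    for (int i = 0; i < count; ++i)
        new (&nodes[i]) Node{i + 1, OffsetPtr<Node>()};
    for (int i = 0; i + 1 < count; ++i)
        nodes[i].next = &nodes[i + 1];
}

// Walks the list starting at `base` and sums the values.
inline int sum_list(const void* base) {
    int sum = 0;
    for (const Node* n = static_cast<const Node*>(base); n; n = n->next.get())
        sum += n->value;
    return sum;
}
```

With raw pointers in Node, the copied list would still point into the
original buffer; with offsets, the copy is self-contained.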

So if we want to achieve persistence with RTL we must develop
base-address-independent containers. This is not a hard task, but
porting, for example, multiindex to offset_ptr-s is not a one-day job.

Regards,

Ion

