
Subject: Re: [Boost-users] Interprocess : vectors of vectors
From: Anthony Foiani (tkil_at_[hidden])
Date: 2013-05-16 08:52:45


Oodini <svdbg____at_[hidden]> writes:

>> If you have plenty of memory (especially virtual memory), then just
>> make it huge, and don't worry about using it all.
>
> Actually, my data will take about 26 GB...

Ouch!

> That's why I want to control the memory.

Yeah, if you can't fit it all in core, then Boost.Interprocess isn't
going to be enough.

(On the other hand, depending on how expensive your programmers are,
remember that fast computers with 64GiB RAM are no longer outrageously
expensive, even as rented hosts. A quick look found me a 32GiB system
for about 1500 USD, 64GiB for 2500 USD. Here in the USA, it's not
unusual to figure 3k-5k USD/wk for developer time, and if it takes you
more than a week to work around the memory limit, then you're better
off just getting the new machine. But anyway...)

>> there might be a bit of extra bookkeeping in the
>> vector "header" itself.
>
> That's one of the problems.

I'd be surprised if the interprocess vector cost more than another
pointer or two over the regular std::vector. They both need the extra
space for capacity vs. size, pointer to start of memory, etc.
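
If you want to see the actual overhead on your platform, a quick
sizeof comparison is easy. A sketch (the exact numbers depend on
your compiler and Boost version; the interprocess vector also
carries its allocator, which holds a pointer back to the segment
manager):

    #include <iostream>
    #include <vector>

    #include <boost/interprocess/managed_shared_memory.hpp>
    #include <boost/interprocess/containers/vector.hpp>
    #include <boost/interprocess/allocators/allocator.hpp>

    namespace bip = boost::interprocess;

    int main()
    {
        typedef bip::allocator<
            int, bip::managed_shared_memory::segment_manager>
                ShmAlloc;
        typedef bip::vector<int, ShmAlloc> ShmVector;

        // Both are a handful of pointer-sized fields: start of
        // storage, size (or end), and capacity.
        std::cout << "sizeof(std::vector<int>) = "
                  << sizeof(std::vector<int>) << "\n"
                  << "sizeof(bip::vector<int>) = "
                  << sizeof(ShmVector) << "\n";
        return 0;
    }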

> reserve(1) or reserve(5) take the same amount of memory.

Right, because at the lowest level, either the vector implementation
has internal optimizations that limit the options, or the allocator
"under" the vector will only hand out memory blocks above some
minimum size.
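
You can watch the second effect directly with the segment manager
itself. A sketch (the segment name and sizes are arbitrary; I'd
expect both requests to round up to the same small block size):

    #include <cstddef>
    #include <iostream>

    #include <boost/interprocess/managed_shared_memory.hpp>
    #include <boost/interprocess/shared_memory_object.hpp>

    namespace bip = boost::interprocess;

    int main()
    {
        bip::shared_memory_object::remove("demo_shm");
        bip::managed_shared_memory seg(bip::create_only,
                                       "demo_shm", 65536);

        std::size_t free0 = seg.get_free_memory();
        void *a = seg.allocate(1);            // ask for 1 byte
        std::size_t cost1 = free0 - seg.get_free_memory();

        std::size_t free1 = seg.get_free_memory();
        void *b = seg.allocate(5);            // ask for 5 bytes
        std::size_t cost5 = free1 - seg.get_free_memory();

        // Small requests get rounded up to the algorithm's minimum
        // block size, plus per-block bookkeeping.
        std::cout << "1-byte request cost: " << cost1 << "\n"
                  << "5-byte request cost: " << cost5 << "\n";

        seg.deallocate(a);
        seg.deallocate(b);
        bip::shared_memory_object::remove("demo_shm");
        return 0;
    }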

>> b. chunks of memory are often larger than actually requested (to
>> improve performance, or to stash housekeeping information with
>> each chunk).
>
> Yes. I'd like to control this behaviour.

You can't really control it without reworking your entire memory
allocation infrastructure from scratch.
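
(For what it's worth, Boost.Interprocess does expose the segment's
memory algorithm as a template parameter, which is about the closest
thing to a supported hook -- but you're swapping the whole algorithm,
not tuning one knob. A sketch using the library's simple_seq_fit in
place of the default rbtree_best_fit, offered as illustration rather
than recommendation:

    #include <boost/interprocess/managed_shared_memory.hpp>
    #include <boost/interprocess/shared_memory_object.hpp>
    #include <boost/interprocess/mem_algo/simple_seq_fit.hpp>
    #include <boost/interprocess/indexes/iset_index.hpp>
    #include <boost/interprocess/sync/mutex_family.hpp>

    namespace bip = boost::interprocess;

    // managed_shared_memory is basic_managed_shared_memory with
    // rbtree_best_fit; this variant uses simple_seq_fit instead,
    // which changes the fit strategy and per-block bookkeeping.
    typedef bip::basic_managed_shared_memory<
        char,
        bip::simple_seq_fit<bip::mutex_family>,
        bip::iset_index
    > SeqFitShm;

    int main()
    {
        bip::shared_memory_object::remove("seqfit_shm");
        SeqFitShm seg(bip::create_only, "seqfit_shm", 65536);
        void *p = seg.allocate(100);
        seg.deallocate(p);
        bip::shared_memory_object::remove("seqfit_shm");
        return 0;
    }

But anything finer-grained than that means writing your own
allocator.)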

Even malloc/free tends to use this trick: if you request 0x200
bytes, it might actually allocate 0x204 bytes, put the value 0x200
in the first 4-byte word, and then hand you a pointer to
allocated_range+4.

That's how "free" knows how big a chunk it is freeing, without
relying on an external lookup table.
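
In sketch form (illustrative only; real allocators are more
sophisticated, with larger headers and alignment padding):

    #include <cstdlib>
    #include <cstddef>

    // The caller sees only the payload; the size lives just
    // before it.
    void *my_malloc(std::size_t n)
    {
        std::size_t *p = static_cast<std::size_t *>(
            std::malloc(n + sizeof(std::size_t)));
        if (!p) return 0;
        *p = n;         // stash the requested size in the header
        return p + 1;   // hand back the address past the header
    }

    void my_free(void *user)
    {
        if (!user) return;
        std::size_t *p = static_cast<std::size_t *>(user) - 1;
        // *p is the original request size -- no external table.
        std::free(p);
    }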

>> So whatever value you get for "total_memory", remember that you
>> need to increase it by some amount.
>
> As we are supposed to provide an exact value for the shared memory,
> there should be a way to know exactly how much memory the data
> structures consume.

Using shm from C++ through Boost.Interprocess, I've always just given
it extra space. On a machine with virtual memory, it should only
consume as much RAM as you're actually using in the segment, not the
entire segment.
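
Concretely, something like this (a sketch; the segment name and size
are arbitrary):

    #include <iostream>

    #include <boost/interprocess/managed_shared_memory.hpp>
    #include <boost/interprocess/shared_memory_object.hpp>

    namespace bip = boost::interprocess;

    int main()
    {
        bip::shared_memory_object::remove("big_shm");

        // Ask for far more than we expect to need; on typical
        // demand-paged OSes, physical pages are only committed as
        // the segment is actually written, so the slack is cheap.
        bip::managed_shared_memory seg(bip::create_only, "big_shm",
                                       256u * 1024u * 1024u);

        std::cout << "segment size: " << seg.get_size() << "\n"
                  << "free memory:  " << seg.get_free_memory()
                  << "\n";

        bip::shared_memory_object::remove("big_shm");
        return 0;
    }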

I've also had the luxury of working on problems that fit easily into
their hosts, e.g., a few megabytes on a multi-gigabyte machine. So
the question has never really come up for me.

>> If you have the memory to spare, go for double, and you should be
>> fine; if you're tight, try maybe 20% more, and pay extra attention
>> to your exception paths.
>
> I can't...
> Thanks a lot for your contribution.

You're welcome. Sorry I didn't have better answers for you.

> I switched back to an implementation based on pointers instead of vectors.

So you do have enough RAM to fit it all in core, but not with the
overhead of the vector structures? Interesting.

You might see if there are ragged / sparse matrix classes that could
be adapted to Boost.Interprocess; those might be closer to your use
case than the general-purpose vector-of-vectors.
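
For example, one flat data vector plus an offsets vector gives you a
ragged structure with exactly one vector "header" in total, instead
of one per row. A rough sketch of my own (not an existing library
class):

    #include <cstddef>
    #include <iostream>

    #include <boost/interprocess/managed_shared_memory.hpp>
    #include <boost/interprocess/shared_memory_object.hpp>
    #include <boost/interprocess/containers/vector.hpp>
    #include <boost/interprocess/allocators/allocator.hpp>

    namespace bip = boost::interprocess;

    typedef bip::managed_shared_memory::segment_manager SegMgr;
    typedef bip::allocator<double, SegMgr>       DataAlloc;
    typedef bip::allocator<std::size_t, SegMgr>  OffAlloc;
    typedef bip::vector<double, DataAlloc>       DataVec;
    typedef bip::vector<std::size_t, OffAlloc>   OffVec;

    int main()
    {
        bip::shared_memory_object::remove("ragged_shm");
        bip::managed_shared_memory seg(bip::create_only,
                                       "ragged_shm", 65536);

        DataVec *data = seg.construct<DataVec>("data")
            (DataAlloc(seg.get_segment_manager()));
        OffVec *offs = seg.construct<OffVec>("offsets")
            (OffAlloc(seg.get_segment_manager()));

        // Two rows: {1.0, 2.0} and {3.0}.  Row i occupies
        // data[offs[i]] .. data[offs[i+1] - 1].
        offs->push_back(0);
        data->push_back(1.0);
        data->push_back(2.0);
        offs->push_back(data->size());
        data->push_back(3.0);
        offs->push_back(data->size());  // sentinel: one past the end

        std::size_t row = 1;
        for (std::size_t j = (*offs)[row];
             j < (*offs)[row + 1]; ++j)
            std::cout << (*data)[j] << "\n";

        bip::shared_memory_object::remove("ragged_shm");
        return 0;
    }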

Good luck!

Best regards,
Anthony Foiani

