Boost logo

Boost :

Subject: Re: [boost] [string] proposal
From: Dean Michael Berris (mikhailberis_at_[hidden])
Date: 2011-01-28 14:20:24

On Sat, Jan 29, 2011 at 2:51 AM, Nevin Liber <nevin_at_[hidden]> wrote:
> On 28 January 2011 12:22, Dean Michael Berris <mikhailberis_at_[hidden]>wrote:
> The fact that contiguous pages don't get stored in contiguous physical
>> memory compounds the problem even more. You feel this a lot when your
>> system starts thrashing and the VMM does the work of swapping pages
>> and re-arranging page tables for you. The round-trips between a CPU
>> page fault and the VMM page fault handler along with potentially
>> invalidating L1/L2/L3 caches costs are significant in environments
>> where you need to get as much things done per second as physically
>> possible. If you can avoid having to go through this dance with a sane
>> memory management strategy, I think that's a win even in the trivial
>> case.
> If I have to keep data in physical memory, the way to use the least number
> of pages is to store it contiguously with respect to virtual memory.
> Anything else will (except for degenerate cases) require *more* physical
> pages of RAM.

More pages doesn't mean more contiguous pages. If you really needed
contiguous pages you already have those containers in place.

I never said that the segmented storage for strings will require less memory. ;)

> There are many other reasons to break something up into a segmented data
> structure, but avoiding thrashing when storing the object is not one of
> them.

Hmmm... ? So how is avoiding the (mostly unnecessary) paging in/out by
not requiring contiguous pages not a way of avoiding thrashing? (Too
many negatives, too tired to edit that message).

> Honestly, trying to argue that your new class is more efficient *on all
> fronts* is futile.  You'd do far better showing the tradeoffs you are
> making; which things are more efficient, and being honest about which things
> are less efficient.

Really, look -- I didn't say my new class is more efficient on all
fronts, I think you're arguing a strawman here.

What was being discussed in this particular line of discourse is the
effect of contiguous pages of memory in a data structure like
std::vector or std::string. For the cases that you don't need these
contiguous pages like in immutable strings, not using contiguous pages
is a good thing, better than unnecessarily forcing the use of
contiguous pages. It wasn't about the saving of memory, it was about
limiting the potential for fragmentation and VMM paging involvement.

Also, read the paper now that it's there. This is getting tiring
really especially when a strawman red herring is thrown in once in a

> It depends on what you're trying to do and what algorithms you're
>> applying on your std::dequeue. This is the whole crux of the segmented
>> iterators debate/paper by Matt Austern and others. Segmented versions
>> of the algorithms would make using std::dequeue a lot more preferable
>> if you really needed a data structure that doesn't promote heap
>> fragmentation as much as a growing std::vector does.
> GIven that a deque requires a new heap allocation for every N elements, how
> does it avoid heap fragmentation?

Consider the flip side with an std::vector which doesn't have a
reallocate option. It requires a *larger* heap allocation everytime it
grows. Now riddle me this, what happens to the memory the vector used
to occupy? Doesn't having irregular sized allocations promote more
fragmentation than having regular sized allocations?

Having same-sized segments is already a good thing for cases where you
have an optimal way of allocating pages -- and most modern OSes have
these. Growing a std::queue introduces less fragmentation and
therefore less thrashing by making the sizes of the segments tuned and
aligned properly.

> And I think this is straying off-topic from Boost...

I agree. Let's stop this talk now.

Dean Michael Berris

Boost list run by bdawes at, gregod at, cpdaniel at, john at