Subject: Re: [boost] [devector] optimum growing/insertion policy
From: Thorsten Ottosen (tottosen_at_[hidden])
Date: 2017-10-16 17:18:29


Den 16-10-2017 kl. 01:45 skrev Ion Gaztañaga via Boost:
> On 15/10/2017 17:59, Joaquin M López Muñoz via Boost wrote:
>> The assumption is of an ever-growing sequence. That said, I recognize
>> there are scenarios not falling under this, most notably FIFO queues.
>> I've given the policy a little more thought and think we can twist it
>> to also accommodate non-growing scenarios, please keep on reading.
>>
>> Thoughts?
>
> I think that the default (we could have other policies) should be simple
> and similar to vector/string/stable_vector/etc....

Yes, let's keep it simple.

> I propose a simple strategy:
>
> Definitions:
> ---------
>
> capacity() = the size of the buffer
> free_capacity() = capacity() - size()
> xxx_free_capacity() = the number of unconstructed items in the xxx (xxx
> = left or right) side of the buffer.
> N = the number of items to be inserted
> GF = growth factor
>
> Strategy:
> ---------
>
> 1) devector is initialized with equal free capacity for both ends.

I assume that implies reserve() does the very same?

> 2) If a xxx (xxx = left or right) insertion is needed and
> xxx_free_capacity >= the number of items to be inserted, a xxx insertion
> is done (minimizes moves).
> 3) Otherwise if a xxx (xxx = left or right) insertion is needed and
> xxx_free_capacity < N, the sequence containing new plus old elements
> should be placed in the middle of the buffer. (Note: This step should be
> limited if amortized constant time is desired for push_xxx insertions)
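
Just to check that I read steps 2) and 3) correctly for a back insertion
of N elements, the decision would be roughly as follows (all names are
hypothetical, only the branching matters):

   #include <cstddef>

   struct buffer_state
   {
      std::size_t size;                // constructed elements
      std::size_t capacity;            // size of the buffer
      std::size_t back_free_capacity;  // unconstructed slots on the right
   };

   enum class action { insert_at_back, relocate_to_middle, reallocate };

   action back_insertion(const buffer_state& b, std::size_t n)
   {
      if (b.back_free_capacity >= n)
         return action::insert_at_back;     // 2) enough room on the right
      if (b.capacity - b.size >= n)
         return action::relocate_to_middle; // 3) enough room overall: recentre
      return action::reallocate;            // otherwise grow the buffer by GF
   }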

There might be some corner cases here. Suppose we have
front_free_capacity() == 3, back_free_capacity() == 0 and we are about to
insert into the right half. Does it then make sense to place elements in
the middle?

If we shift by one element position we pay (on average)

  size * 3/4 moves

If we move to the middle, we have to move everything and pay

  size moves

We now have two (average) insertions in which to win back the lost moves
before we reallocate. One will be on the left side, so it has no impact.
The other insertion will now take

  size * 1/4 moves

if we placed in the middle and

  size * 3/4 moves

if we shifted. So shifting leads to

  size * 6/4 moves

and moving to the middle leads to

  size * 5/4 moves.
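
Putting the two side by side (first insertion plus the two follow-up
insertions):

  shift by one:       size * 3/4 + 0 + size * 3/4 = size * 6/4 moves
  move to the middle: size       + 0 + size * 1/4 = size * 5/4 moves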

So I agree, we should probably always move to the middle if the free
space at the other end is >= 3. This applies to situations where
reallocating is not desired.

> Worst case: for a repeated one-side insertion, when N/2 insertions are
> performed, the first relocation step will take place. In general,
> log(N/2) relocation steps (each one with an increasing number of moves)
> will be performed before reallocating a new buffer. I don't think it's
> amortized constant time, but amortized logarithmic time (but see below).
>
> Extra moves can be reduced by imposing a capacity() that is a factor
> (say reallocation factor, RF) of the buffer size, forcing a
> reallocation when the load factor is high. This means that if size()
> reaches capacity() = RF*internal_buffer_size, even if there is room for the
> left/right insertions a reallocation will be performed. This means that
> capacity() = RF*internal buffer is known and means something to the
> user. Memory usage will grow, but data movement will be minimized. RF
> could be a policy-based parameter.
>
> I think the RF factor obtains an amortized constant time for one side
> insertion if that property is considered useful.

I think that is a must.

> A fast choice would be to allow just two relocations, an RF of 7/8
> (approx. 0.87), which would mean approx. 2.6*N moves to push_back N
> elements. An RF of 3/4 (0.75, a single relocation step) would mean
> 1.33*N moves to push_back N elements.

I like low numbers here.
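
In code, I read the trigger as nothing more than a load-factor check done
before any insertion or relocation step, something like this (RF as a
policy constant, names made up):

   #include <cstddef>

   // Force a reallocation once the load factor passes RF, even if one end
   // still has free capacity, so that one-sided pushes only ever do a
   // bounded number of move-to-middle relocations.
   constexpr double RF = 0.75;   // e.g. a single relocation step

   bool force_reallocation(std::size_t size, std::size_t n,
                           std::size_t buffer_size)
   {
      return size + n > static_cast<std::size_t>(RF * buffer_size);
   }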

> For a FIFO, I think a deque that reuses free segment capacity on one
> side and transfers it to the other side before allocating new memory
> would be the best choice. Currently boost::container::deque does not do
> this, but pull requests are welcome ;-)

Or maybe just boost::circular_buffer (is this not the optimal way to
handle that case?). No one is saying that devector has to solve every
problem.
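
For reference, a bounded FIFO on top of circular_buffer is about as simple
as it gets; once the capacity is chosen, no relocation or reallocation ever
happens:

   #include <boost/circular_buffer.hpp>
   #include <cassert>

   int main()
   {
      boost::circular_buffer<int> fifo(3);
      fifo.push_back(1);
      fifo.push_back(2);
      fifo.push_back(3);

      assert(fifo.front() == 1);
      fifo.pop_front();    // consume the oldest element
      fifo.push_back(4);   // reuse the freed slot, no allocation
      assert(fifo.front() == 2);
   }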

I do want to focus on simple properties from a user's perspective. Some
cases to consider:

# case 1: push_back from empty state
------------------------------------

   vector<int> v;
   v.reserve( N );

This guarantees N pushes without exceptions and without reallocation.

# case 2: push_back from unknown state
--------------------------------------

   vector<int> v;
   ...
   v.reserve( v.size() + N );

This guarantees N pushes without exceptions and without reallocation.

# case 3: insert from empty state
---------------------------------

   vector<int> v;
   v.reserve( N );

This guarantees N inserts without exceptions and without reallocation.

# case 4: insert from unknown state
-----------------------------------

   vector<int> v;
   ...
   v.reserve( v.size() + N );

This guarantees N inserts without exceptions and without reallocation.

How do these properties work for a devector? If I understand your
strategy, we get the following:

# case 1: push_back from empty state
------------------------------------

   devector<int> v;
   v.reserve( N );

This does not guarantee N pushes without exceptions or without
reallocation. To get that guarantee we have to write

   devector<int> v;
   v.reserve_back( N );

# case 2: push_back from unknown state
--------------------------------------

Same as for case 1: no guarantees unless reserve_back is called.

# case 3: insert from empty state
---------------------------------

   devector<int> v;
   v.reserve( N );

This does not guarantee N inserts without exceptions or without
reallocation. Using reserve_back or reserve_front doesn't help either.
The only thing that works is if insert defers all reallocation till it
is absolutely needed.

# case 4: insert from unknown state
-----------------------------------

Same as for case 3: no guarantees unless insert exhausts the existing
capacity before reallocating.

There are various ways to fix that: either the user replaces

    v.reserve( N )
    v.reserve( v.size() + N )

with

    v.reserve( N * 2 )
    v.reserve( v.size() + N * 2 )

or reserve() automatically acquires double the extra memory requested.
The latter will also make generic code work as expected. Using double the
requested amount of memory automatically guarantees the optimal insertion
performance: no move to the middle is ever required, and on average
devector performs half the moves of vector.
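
A sketch of what I mean by that (hypothetical names, just to pin down the
arithmetic): the extra memory asked for is doubled and the elements end up
centred, so both ends get at least the requested free capacity:

   #include <cstddef>

   struct layout
   {
      std::size_t buffer_size;
      std::size_t front_free_capacity;
      std::size_t back_free_capacity;
   };

   layout reserve_centred(std::size_t size, std::size_t requested)
   {
      std::size_t extra = requested > size ? requested - size : 0;
      return { size + 2 * extra, extra, extra };  // double the extra memory
   }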

If the user wants less memory waste for push_front/push_back, he can use
reserve_front/reserve_back.

If the user wants full capacity usage for insert, he can only get that
if insert exhausts memory by moving to the middle before reallocating and
reserve() allocates exactly what he asks for. Even so, devector should
perform no worse than vector, and likely somewhat better.

Unfortunately, there is no scheme that is 100% compatible with vector if
reserve() for devector produces front_free_capacity() ==
back_free_capacity(). For that to work, reserve() must act as
reserve_back().

Would that be undesirable for insert if reserve() is modeled after
vector's? What would happen is that the first time an insertion falls in
the front half, the elements are moved to the middle. This is probably
acceptable compared to removing reserve() altogether because its
semantics are ambiguous.

How do we get optimal performance for flat containers? We can still
call reserve( 2 * N ) and then only pay one move-to-the-middle
operation. This is not 100% optimal, granted. For that to happen, flat
containers would need to forward additional arguments to reserve()
(and possibly clear() as well).

How bad is it to waste 50% memory for optimal flat containers? If the
container is local to a function, it is probably of no concern at all.
Otherwise one may use trim_to_fit() if that applies. At any rate, it
would be the user who decides.

kind regards

Thorsten

