Boost logo

Boost :

Subject: Re: [boost] [review] The review of Boost.DoubleEnded starts today: September 21 - September 30
From: Zach Laine (whatwasthataddress_at_[hidden])
Date: 2017-09-25 00:52:41


On Thu, Sep 21, 2017 at 12:38 PM, Thorsten Ottosen via Boost <
boost_at_[hidden]> wrote:

> Dear users and members of Boost,
>
> I'm proud to announce that the formal review of Benedek Thaler's
> Boost.DoubleEnded library starts today, September 21, and runs through
> September 30. This library is the result of Benedek's participation in
> Google Summer of Code 2015.

[snip]

> Please provide in your review whatever information you think is
> valuable to understand your final choice of ACCEPT or REJECT including
> Fit as a Boost library. Please be explicit about your decision.
>

I think you mean Boost.DoubleEnded, not Fit :).

I vote to reject Boost.DoubleEnded.

> Some other questions you might want to consider answering:
>
> - What is your evaluation of the design?
>

I don't think it is up to Boost's standards.

My first concern is that this library is centered around runtime
performance,
yet there appears to be no measurement done to determine what -- or even
whether -- optimization occurs when using this library instead of the
standard
containers.

My second concern is that the containers in this library are difficult to
reason about.

You'll find these two issues reinforced multiple times in the details of
the review below.

Here's the first snippet of code from the examples in the
online docs:

de::devector<int, de::small_buffer_size<16>> small_devector;

BOOST_ASSERT(small_devector.capacity() == 16); // but no dynamic memory was
allocated

small_devector.push_back(2);
small_devector.push_back(3);
small_devector.push_back(4);

small_devector.push_front(1); // allocates

There are no standard containers for which capacity() == 16 implies that
when there are three elements, an allocation will occur when I add a
fourth. That's substantially odd. It means that as a user I must know
about extrinsic state information (how many pushes were to the front or
back) in order to predict memory usage and the number of allocations even
roughly.

The next example is this:

de::devector<int> reserved_devector(32, 16, de::reserve_only_tag{});

for (int i = 0; i < 32; ++i) {
  reserved_devector.push_front(i);
}

for (int i = 0; i < 16; ++i) {
  reserved_devector.push_back(i);
}

That's not a compelling use case to me. Why can't I just do this:

std::vector<int> reserved_vector;
vector.reserve(48);

// Fill in the values, etc.

Why must there be an implicit zero-point, such that front-pushes allocate
before the capacity is reached, and back-pushes allocate before the
capacity is reached (but after a different number of pushes)? This is all
pretty wonky. If I only care about growth in one direction, I just use a
vector, and don't worry about the push-front case, because I have reverse
iterators. Otherwise, I'm inclined just to use deque, because I don't have
to consider these wonky semantics with that, and I have no measurements to
recommend the use of this library instead.

(
BTW, please create an object of your tag type:

de::devector<int> reserved_devector(32, 16, de::reserve_only_tag{});
->
de::devector<int> reserved_devector(32, 16, de::reserve_only);
)

For that matter, is the performance of devector better than std::deque when
I
don't know the relative frequency of front- and back-pushes? It seems in
many
cases that it would be quite a bit worse.

But it seems that no one knows -- not me, not the author -- because no one
has measured it. If I'm wrong about this, please share the measurements!

A custom growth policy must have an integral growth factor, but a commonly
used growth factor is 3/2.

I don't understand why should_shrink() is part of GrowthPolicy. If I'm
growing the container, why am I worried about shrinking it?

Another example from the docs:

de::devector<std::string> reversed_lines;
reversed_lines.reserve_front(24);

std::string line;

while (std::getline(std::cin, line))
{
  reversed_lines.push_front(line);
}

std::cout << "Reversed lines:\n";
for (auto&& line : reversed_lines)
{
  std::cout << line << "\n";
}

How is this better than doing:

std::deque<std::string> reversed_lines;

std::string line;
while (std::getline(std::cin, line))
{
  reversed_lines.push_front(line);
}

std::cout << "Reversed lines:\n";
for (auto it = reversed_lined.begin(), end = reversed_lines.rend());
     it != end; ++it)
{
  std::cout << line << "\n";
}

? If it's just to avoid the verbose for-loop, that's not enough for me. If
it's an efficiency claim, I have no numbers to back that up, and the use of
IO
in the example makes it moot anyway.

I need a justification for unsafe_push_front() besides "avoiding a check,
thus
sparing a few instructions". My intuition is that it does not matter, since
branch prediction is quite good these days, and a single allocation will
swallow all the gains if there are any. However, *measurement* is the only
thing that can make this justification, pro or con.

Again, from the example docs:

// int sockfd = socket(...); connect(...);
de::devector<char> buffer(256, de::unsafe_uninitialized_tag{});
long recvSize = recv(sockfd, buffer.data(), buffer.size(), 0);

if (recvSize >= 0)
{
  buffer.unsafe_uninitialized_resize_back(recvSize);
  // process contents of buffer
}
else { /* handle error */ }

The unsafe resize does nothing in this case, since char has a trivial
dtor. For
nontrivially destructed types, we get RAII violations. I don't know how to
use a type that does that.

> - What is your evaluation of the implementation?
>

I did not look.

> - What is your evaluation of the documentation?
>

There are a lot of aspects of the docs I found to be unclear.

The Benchmarks section contains no benchmarks.

There is this:

"
Reference stability is an important feature of the batch_deque. Inserting
elements to the front or to the back of the sequence never invalidates any
references. Moreover, a special insert operation is provided, stable_insert,
which takes an iterator, as a hint, and inserts a sequence of elements (not
necessary directly) before the element pointed by the hint, in a way that
it doesn't invalidate references.
"

I find that very difficult to understand. It seems to be saying that
stable_insert() does not always insert where you wanted in the sequence,
because the insertion point is just a hint. I think what is intended is
that the insertion will not necessarily be contiguous (or at least I hope
that's what is intended).

These are other examples like this, but until the high-level design issues
are resolved, I don't want to take the time to get into all the specifics.

> - What is your evaluation of the potential usefulness of the library?
>

It does not seem useful to me, except that the ability to specify the
block-size for a deque would sometimes come in handy.

> - Did you try to use the library? With which compiler(s)? Did you
> have any problems?
>

I did not.

> - How much effort did you put into your evaluation? A glance? A quick
> reading? In-depth study?
>

I spent 4-5 hours.

  - Are you knowledgeable about the problem domain?
>

Yes.

Zach


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk