
Subject: Re: [boost] [review] The review of Boost.DoubleEnded starts today: September 21 - September 30
From: Zach Laine (whatwasthataddress_at_[hidden])
Date: 2017-09-26 01:39:10

On Mon, Sep 25, 2017 at 12:17 PM, Thorsten Ottosen via Boost <
boost_at_[hidden]> wrote:

> Hi Zach,
> Thanks for spending a lot of time thinking about the library. Some minor
> points for discussion follow.
> My first concern is that this library is centered around runtime
>> performance,
>> yet there appears to be no measurement done to determine what -- or even
>> whether -- optimization occurs when using this library instead of the
>> standard
>> containers.
> For devector:
> -------------
> Is it it not obvious that an O(1) push_front performs better than an O(n)
> vec.insert( vec.begin(), x ) ?

I find the use of the word "obvious" when applied to optimization to be a
little suspect. Also, that statement is not quite true. push_front() is
*amortized* O(1), not O(1). A particular call to push_front() is actually
O(N). But the larger point to me is that I can use vector when I care
about amortized O(1) growth in either direction. I only care about
devector or deque when I need amortized O(1) growth in *both* directions.
Then, it is a question of whether the relative performance recommends
devector or deque. As of this writing, I have to write benchmarks or do a
lot of custom profiling to figure out which works better.

My strong preference would be that the author provide sufficient
measurements of a few use cases of both of these data structures at various
values of N, so that I can make a reasonable educated guess when first
selecting the data structure to use. It won't allow me to elide profiling
altogether, but it will get me going in the right direction more often,
with less effort for me as a user.

> Is it not obvious that having a small buffer optimization can be useful?
> boost::small_vector does it. Nearly all std::string does it.
> stlsoft::auto_buffer has done it for a decade, described in detail in
> Matthew Wilson's book. Chandler Carruth had the following presentation last
> year
> about small buffer optimizations use in Clang data structures.

You don't need to justify SBO to me; I'm all for it. We already have
boost::small_vector though, as you point out. Furthermore, that particular
optimization only pays off when N is small. When N is small, amortization
doesn't really pay off. So, again, since I would only use devector/deque
when I know I need growth in *both* directions, I would want to see the
relative strengths and weaknesses of deque vs. devector to make an informed
choice.

> For batch_deque:
> ----------------
> With std::deque you cannot control the segment size, and implementations
> differ widely. Is it not obvious that having a block size of 1 or 2 (as
> e.g. visual c++ does) is very slow?

Right. Control of the segment size is the one thing I called out in my
review as an unambiguous positive.

> That iteration over a deque can be slow has been known for two decades,
> Matt Austern wrote an article about it.

... and this should have been the other. I'm aware of that work, and I'd
like to see a deque that features configurable segment size and segmented
iteration support. As an aside, my experiments with segmented iteration
have been of mixed utility. The set of standard algorithms usable on
segments is limited, since per-segment calls cannot see across segment
boundaries. For instance, adjacent_difference() does not work without some
gymnastics.

> My second concern is that the containers in this library are difficult to
>> reason about.
> Look at it this way: devector is like vector with O(1) front insertion;
> batch_deque is like deque with custom control over the segment size.
> Are you opposed to such containers or is it "all the extra" stuff (which
> you don't have to use btw) that makes you uneasy?

Up until now, in this post all my objections have been related to runtime
efficiency. But now you've brought back up a major complaint that I have,
which is the unsafety of the interface. To say that I don't have to use it
is a bit disingenuous to me. Adding an unsafe API weakens the invariants
of the type, making it more difficult to reason about and use. The idea
that I can add elements that will be manually destructed but that are not
yet constructed is a serious defect in my opinion. Any throwing operation
I do between the creation of such elements and their construction is now

> Serialization:
> --------------
> std types are over encapsulated which means that they can't avoid double
> initialization, nor process elements in batches. Is it not obvious that
> double initialization is more expensive than single initialization?

In the abstract, of course you cannot argue with that. I don't find it to
be a useful question, though. The questions I have are:

1) In my specific case, do I care (that is, is it measurable)? In many
cases, a type will be expensive to copy or non-default-construct, but
trivial to default construct.
2) If the answer to #1 is yes, is reserve() not sufficient?

I would expect the answer to one of those two questions to be "no" in
almost all cases. It's not that you'd *never* answer "yes" to both, it's
just that this represents a low-frequency corner.

> For that matter, is the performance of devector better than std::deque when
>> I
>> don't know the relative frequency of front- and back-pushes? It seems in
>> many
>> cases that it would be quite a bit worse.
> You are comparing apples and oranges. push_back on vector may be worse
> than on deque (just try to store array<T,1024>).

Re-reading that, I think I didn't state my point very clearly. What I was
trying to say is that the zero-point implied in having separate front and
back capacities means that I need to guess the relative frequency of front
and back pushes if I want the benefits of reserving space. When you
combine that with the fact that growing a devector requires copying every
element, it may be a pessimization to use one over a deque. To me, that's
not an apples/oranges comparison, because my use of the entire library
boils down to the case where I know I need growth in *both* directions, and
now I must determine whether devector or deque is a better fit.

> It will depend on the situation, how expensive move operations are, how
> expensive allocation is etc. But if you need or want a contiguous layout,
> you can't use deque.
> Another example from the docs:
>> de::devector<std::string> reversed_lines;
>> reversed_lines.reserve_front(24);
> [snip]
> ? If it's just to avoid the verbose for-loop, that's not enough for me.
>> If
>> it's an efficiency claim, I have no numbers to back that up, and the use
>> of
>> IO in the example makes it moot anyway.
> It's not about rewriting a loop. It about having a different layout for
> your container's sequence because that layout is either more natural or
> convenient or simply needed.

Fair enough. Sometimes contiguity is very important.

> The first use case for devector is actually in the implementation of
> batch_deque. One could use some deque implementation, but that would not be
> optimal for performance.

For small objects or large? For what values of N? For what pattern of
calls? Only for double-ended growth, or for single-ended growth as well?
I don't necessarily doubt your claim, I just don't have the relative
superiority of devector quantified in any way that I can use.

> Being able to use devector there has made the code so much simpler.

It's not obvious to me why. Could you elaborate?

> I need a justification for unsafe_push_front() besides "avoiding a check,
>> thus sparing a few instructions". My intuition is that it does not
>> matter, since
>> branch prediction is quite good these days, and a single allocation will
>> swallow all the gains if there are any. However, *measurement* is the
>> only
>> thing that can make this justification, pro or con.
> Yes, and what do you do when you actually measure and find out push_back
> is the bottleneck?
> vector has both operator[] and at(), and I'm pretty sure people would go
> mad if they were forced to use an operator[] that did what at() does. So
> why is push_back any different? You have basically the same situation: two
> functions that differ only slightly in their precondition.

First, please demonstrate that I should care about those few instructions.
Second, I agree that the difference is slight, but I disagree that it's ok
to make that slight change. One of the chief invariants of a standard
container is that it manages its storage. This breaks that. Dangerous
interfaces are not necessarily evil, but they certainly need to be
justified. I would need significant use cases with significant performance
benefits (not night-and-day, but significant) to tolerate this unsafety.

>> Again, from the example docs:
>> // int sockfd = socket(...); connect(...);
>> de::devector<char> buffer(256, de::unsafe_uninitialized_tag{});
>> long recvSize = recv(sockfd, buffer.data(), buffer.size(), 0);
>> if (recvSize >= 0)
>> {
>> buffer.unsafe_uninitialized_resize_back(recvSize);
>> // process contents of buffer
>> }
>> else { /* handle error */ }
>> The unsafe resize does nothing in this case, since char has a trivial
>> dtor.
> It changes the size of the container so you can't read uninitialized
> values.

That's true, though some sort of span would accomplish the same thing with
no loss of safety. And when the unsafe uninitialized resize grows the
container, it leaves around storage that is not yet constructed but that
the dtor will nevertheless destruct, which is very dangerous.

If I distill all my objections down it comes to these three points, I think:

1) This library seems to cover small corner cases.
2) The API is unsafe.
3) I don't have any reason to expect that the efficiency gains in the
corner cases are substantial enough to justify the unsafety, or even that
there are positive gains in my particular corner case.

