Boost logo

Boost :

Subject: Re: [boost] [unordered] Buffered functions?
From: David Abrahams (dave_at_[hidden])
Date: 2008-12-28 01:25:17


on Sat Dec 27 2008, Howard Hinnant <hinnant-AT-twcny.rr.com> wrote:

> On Dec 27, 2008, at 6:57 PM, David Abrahams wrote:
>
>> One thing I've always wondered about this: in that case -- at least when
>> you're not also trying to use the small object optimization -- why not
>> always use a Pimpl idiom where an empty container (with no capacity) can
>> be represented by a single null pointer?
>
> I've simply had clients complain to me: why is sizeof(vector) 4 words instead of 3?
> Please make it 3.

I take it you mean that nobody has ever complained to you that
sizeof(vector) should have been one word.

> Given that complaint, I can attempt to generate reasonable use cases
> which motivate such a complaint. But my basic premise is that those
> complaining know their needs far better than I do.
>
> I've never actually had anyone complain to me about the sizeof an
> unordered_set. However it seems logical to me that if clients are
> concerned about the overhead of one container, they could be concerned
> about the overhead of another (and thus any) container.

By the same token, it seems logical to me that if reducing the size of a
vector to 3 words matters, reducing it to one word matters even more ;-)

> Concerning your suggestion about
> container<unique_ptr<unordered_set<A>>> as a substitute for
> container<unordered_set<A>> (or equivalent),

That's a twisty way to say it. Just to be sure we understand one
another, I'll clarify that I never suggested about asking clients to use
the unique_ptr approach. Rather, my suggestion was that the library use
unique_ptr (or equivalent) under the covers to minimize the size of
empty data structures.

> my best guess is that there may be clients with use cases where the
> median size of the inner container is small but non-zero, and thus the
> extra heap allocation and indirection ends up costing more.

I'm guessing that most of those clients either haven't had the
opportunity to make the measurement, or else they use unique_ptr (or
equivalent) explicitly to get the effect they want. Also, the lack, up
to now, of a conforming auto_ptr-sized thingy that can be stored in
containers has surely impacted the measurements people are able to
conveniently make.

> Suffice it to say that from a general purpose library point of view,
> more overhead is never better if you can get the equivalent
> functionality/performance out of less overhead. And historically, the
> design of the std::containers has given precedence to speed and space
> performance as opposed to turning basic exception safety into strong
> exception safety (and I agree with that design philosophy for a
> general purpose library).

Me too...

>>>>> It seems like you think compressed pair can only compress away one
>>>> member, but IIUC you can chain them and then derive your hash table from
>>>> that to compress as much as you like. Of course it would be nice to
>>>> have a compressed_tuple that handles it all for us, but I thikn you
>>>> _can_ do it.
>>>
>>> I have high hopes that std::tuple will become the compressed_pair of
>>> the future. Technically, and especially with variadic template
>>> language support, it is not that hard to do (tuple is a pain no matter
>>> what without variadic template language support). Standards-wise, the
>>> current C++0X draft (N2798) does not outlaw, nor mandate, the empty
>>> member optimization for tuple,
>>
>> Except that there is no "empty member optimization," right? Unless this
>> is one of the langauge changes I've failed to notice (and there are a
>> few), I think what you mean is that tuple is free to use derivation to
>> take full advantage of the EBO, right?
>>
>>> and I hope to keep it that way. Recent straw polls on the LWG have
>>> been favorable to allowing the empty member optimization for
>>> std::tuple.
>>
>> Until we get CWG to agree that the EMO exists, I doubt straw polls in
>> LWG can make much difference ;-)
>
> I used the term "empty member optimization" to refer to the library technique for
> optimization as opposed to a language optimization. Your assumption regarding my
> meaning referring to derivation under the covers is correct. As long as we're on the
> subject, I should warn off naive attempts at this optimization which look like:
>
> namespace std {
>
> template <class T, class A>
> class vector
> : private A
> {
> ...
> };
>
> } // std
>
> Such an implementation of "EMO" is error prone as it associates the namespace of the
> client-supplied A with that of the vector. I'm not preaching to Dave with this
> statement as he has a proven track record in this department with his implementation
> of boost::noncopyable
> (http://www.boost.org/doc/libs/1_37_0/libs/utility/utility.htm#Class_noncopyable
> ).

Probably you should be preaching to me. That protection was added in
2004, and although I worry about unintended ADL a lot, the technique I
used there wouldn't help anyone trying to leverage EBO. In fact overuse
of boost::noncopyable can undermine EBO for reasons pointed out in...

> I mention this warning aimed at the widest possible audience. Instead
> one should use the techniques employed by boost::compressed_pair, or
> go back to (as far as I know) the originator of this technique, Nathan
> Myers (http://www.cantrip.org/emptyopt.html ).

...Nathan's article. Nice article, Nathan! I never saw that before.

The conundrum here is that if we use that technique for tuple, then a
tuple of empty types can never be empty itself. I'm starting to get ill
just thinking about a generic framework for composing "logically empty"
types that undoes this "EBO erasure" effect when necessary.

-- 
Dave Abrahams
BoostPro Computing
http://www.boostpro.com

Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk