From: Thore Karlsen (sid_at_[hidden])
Date: 2005-08-19 09:10:55


On Fri, 19 Aug 2005 09:08:59 -0400, Rob Stewart <stewart_at_[hidden]> wrote:

>> The performance problems of requiring vector<char> or char[N] exist on
>> several levels:
>>
>> - For vector<char>, there is the initialisation of the chars to 0 on
>> construction or when you do a resize. Note that this is proportional to
>> the size of the vector, not necessarily to the amount of data
>> transferred. I have seen this have a noticeable cost in a CPU-bound
>> server handling thousands of connections.

>Don't construct a vector of a given size or use resize(), then.
>Rely on reserve() instead.

Then size() will return the wrong value (it stays at zero no matter how
much data was read into the reserved storage), and relying on capacity()
is not a good idea. If you're thinking about pushing data onto the vector
as it's read, that's also bad, because then you'd have to read into a
temporary buffer first and copy it to the vector afterwards. (Or do
multiple resizes.)
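
To make the size()/capacity() point concrete, here is a minimal sketch.
The struct and function names are made up for illustration and are not
part of asio or any proposed interface:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record of what reserve() and resize() leave behind.
struct BufferSizes {
    std::size_t size_after_reserve;
    std::size_t capacity_after_reserve;
    std::size_t size_after_resize;
    bool resize_zero_filled;
};

// Hypothetical helper: builds one vector with reserve() and one with
// resize(), then reports the observable difference.
BufferSizes compare_reserve_and_resize(std::size_t n) {
    assert(n > 0);

    std::vector<char> reserved;
    reserved.reserve(n);  // allocates storage but constructs no elements

    std::vector<char> resized;
    resized.resize(n);    // value-initialises n chars to 0: work proportional to n

    bool all_zero = true;
    for (char c : resized) {
        if (c != 0) all_zero = false;
    }

    return {reserved.size(), reserved.capacity(), resized.size(), all_zero};
}
```

With n = 4096, the reserved vector still reports size() == 0, so anything
that relies on size() sees an empty buffer, while the resized one has
paid for writing 4096 zeros up front.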

I think it's a very bad idea to require vector<char> or a static array.
Christopher does a good job of explaining the drawbacks, and I agree
with him. I also do high performance asynchronous networking in my
server and client applications, and a library requiring vector<char> or
a static array would be completely useless to me. Most of the time I
don't have the data I want to send in a vector or in a static array, and
most of the time the amount of data is too big to send or receive a
whole buffer at a time.

>> - Requiring a copy from a native data structure into vector<char> or
>> char[N]. If I have an array of doubles, say, I should be able to send
>> it as-is to a peer that has identical architecture and compiler.
>> Avoiding unnecessary data copying is a vital part of implementing high
>> performance protocols.

>Agreed. OTOH, using swap(), *if* a user used a vector<double>
>instead of the array you mention, then vector won't add overhead.

Why would a swap be necessary?
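
As a rough sketch of what I mean: with a pointer + size interface, both
a plain array and a vector<double> can be handed over as-is, with no
swap and no intermediate copy. FakeSocket and send_doubles are
hypothetical stand-ins invented for this example (the fake socket just
records the bytes), and sending raw doubles assumes the peer has the
same architecture and compiler, as in the quoted text:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for a socket: records the bytes it is asked to send.
struct FakeSocket {
    std::string wire;

    std::size_t send(const void* data, std::size_t length) {
        wire.append(static_cast<const char*>(data), length);
        return length;
    }
};

// Hypothetical helper: sends `count` doubles straight from the caller's
// storage, without first copying them into a vector<char> or char[N].
std::size_t send_doubles(FakeSocket& s, const double* values, std::size_t count) {
    assert(values != nullptr || count == 0);
    return s.send(values, count * sizeof(double));
}
```

Calling send_doubles(sock, raw, 3) for a double raw[3], or
send_doubles(sock, v.data(), v.size()) for a std::vector<double> v,
reads from the storage the caller already owns.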

>> I believe that adding safety is best done in layers, in
>> accordance with the don't pay for what you don't need principle:
>>
>> asio::socket::send/recv etc taking void* + size_t. These functions can
>> result in a short send or receive.
>> ^
>> |
>> asio::send_n/recv_n etc taking void* + size_t. These functions attempt
>> to transfer all of the requested data.
>> ^
>> |
>> asio::send_?/recv_? etc. New functions that take safe types like
>> vector<char> and char[N].

>std::vector takes an allocator template and constructor
>argument. A suitable allocator type could be constructed to
>"allocate" from a user-supplied buffer and fail when that buffer
>is exhausted. The problem here is that the Standard allows all
>instances of the same type to be considered equivalent, so this
>won't work.
>
>Instead, how about a std::vector-like class that takes a
>user-defined, fixed-size block of memory?

No, that would still require a copy if the data isn't already in such a
buffer. void * (or unsigned char *, or char *, or whatever) HAS to be
there, otherwise the library is useless. Such a class could be an option
(and I would like to see it as an option), but not a requirement.
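
A minimal sketch of how the layers could stack, with the safe types as
an option on top of the pointer + size functions rather than a
replacement for them. All names here (ShortSendSocket, send_n, send_all)
are hypothetical and only illustrate the layering; this is not asio's
actual interface:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Layer 1 (hypothetical): a send that may accept fewer bytes than
// requested, imitating a short send on a real socket.
struct ShortSendSocket {
    std::string wire;
    std::size_t max_chunk;

    explicit ShortSendSocket(std::size_t chunk) : max_chunk(chunk) {
        assert(chunk > 0);
    }

    std::size_t send(const void* data, std::size_t length) {
        std::size_t n = std::min(length, max_chunk);
        wire.append(static_cast<const char*>(data), n);
        return n;
    }
};

// Layer 2: keep calling send() until everything has been transferred
// or the socket stops making progress.
std::size_t send_n(ShortSendSocket& s, const void* data, std::size_t length) {
    const char* p = static_cast<const char*>(data);
    std::size_t sent = 0;
    while (sent < length) {
        std::size_t n = s.send(p + sent, length - sent);
        if (n == 0) break;  // no progress; give up rather than spin
        sent += n;
    }
    return sent;
}

// Layer 3: optional convenience overloads for "safe" types. They only
// forward to layer 2, so callers with raw buffers pay nothing for them.
std::size_t send_all(ShortSendSocket& s, const std::vector<char>& buf) {
    return buf.empty() ? 0 : send_n(s, buf.data(), buf.size());
}

template <std::size_t N>
std::size_t send_all(ShortSendSocket& s, const char (&buf)[N]) {
    return send_n(s, buf, N);
}
```

Code that already holds its data in its own buffers calls send() or
send_n() directly; code that happens to have a vector<char> or char[N]
can use the top layer.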

In my applications I can't afford to copy data from my internal buffers
to whatever the networking library requires. I also can't put the data
in such buffers to begin with.

-- 
Be seeing you.
