From: Andrey Semashev (andrey.semashev_at_[hidden])
Date: 2019-12-04 18:10:41


On 2019-12-04 19:05, Peter Dimov via Boost wrote:
> Phil Endecott wrote:
>> Peter Dimov <pdimov_at_[hidden]> wrote:
>
>> > Storing the size (as capacity - size) in the last char for N < 256 will
>> > have more impact, but I'm not sure that it too is worth the added
>> > complexity.
>>
>> Why the last char, rather than always having the size (of whatever
>> appropriate type) first?  Is the idea that this makes data() and
>> c_str() essentially no-ops?
>
> The idea here is that you win one byte by reusing the last byte of the
> storage as the size, overlapping it with the null terminator in the
> size() == N case (because capacity - size becomes 0).
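
For illustration, the trick described above could be sketched roughly as follows. This is a hypothetical class (tail_size_string and assign are made-up names), not code from the library under discussion; it only shows how the last byte can hold capacity - size and double as the null terminator when the string is full.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical sketch; names and interface are assumptions for illustration.
template <std::size_t N>
class tail_size_string {
    static_assert(N < 256, "remaining capacity must fit in one byte");

    // N characters plus one trailing byte that stores (N - size).
    // When size == N that byte is 0, so it is also the null terminator.
    char buf_[N + 1];

public:
    tail_size_string() {
        buf_[0] = '\0';
        buf_[N] = static_cast<char>(N); // empty string: remaining capacity is N
    }

    // size() has to read the last byte of the storage and subtract.
    std::size_t size() const {
        return N - static_cast<unsigned char>(buf_[N]);
    }

    const char* c_str() const { return buf_; }

    // Copies s into the buffer; returns false (and changes nothing) if it does not fit.
    bool assign(const char* s) {
        const std::size_t len = std::strlen(s);
        if (len > N) return false;
        std::memcpy(buf_, s, len);
        buf_[len] = '\0';                     // terminator; this is buf_[N] when len == N
        buf_[N] = static_cast<char>(N - len); // remaining capacity; 0 when full
        return true;
    }
};
```

With this layout sizeof(tail_size_string<15>) is 16, one byte less than a variant that keeps a separate one-byte size member next to a 16-byte buffer.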

I'm not sure this would actually be beneficial in terms of performance.
Ignoring the fact that size() becomes more expensive, and this is a
relatively often used function, you also have to access the tail of the
storage, which is likely on a different cache line than the beginning of
the string. It is more likely that the user will want to process the
string in the forward direction, possibly not until the end (think
comparison operators, copy/assignment, for instance). If the string is
not close to full capacity, you would only fetch the tail cache line to
get the string size.

It is for this reason that placing any auxiliary members, like size,
before the storage array is preferable. Of course, if you prefer memory
size over speed, placing size in the last byte is preferable.
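
To make the cache line argument concrete, here is a small sketch comparing the two layouts. The struct names are hypothetical, and the 64-byte cache line and the assumption that the object starts on a cache line boundary are mine; real hardware and alignment may differ.

```cpp
#include <cstddef>

// Assumed cache line size; typical for current x86, but not guaranteed.
constexpr std::size_t cache_line = 64;

// Hypothetical layout with the size stored before the characters.
template <std::size_t N>
struct size_first {
    unsigned char size; // shares a cache line with the first characters
    char data[N + 1];
};

// Hypothetical layout with (N - size) stored in the last byte, data[N].
template <std::size_t N>
struct size_last {
    char data[N + 1];
};

// Index of the cache line that holds the byte at the given offset,
// assuming the object itself starts on a cache line boundary.
constexpr std::size_t line_of(std::size_t offset) {
    return offset / cache_line;
}

// Cache line that has to be fetched to read the size in each layout.
template <std::size_t N>
constexpr std::size_t size_line_first() {
    return line_of(offsetof(size_first<N>, size));
}

template <std::size_t N>
constexpr std::size_t size_line_last() {
    return line_of(offsetof(size_last<N>, data) + N);
}
```

Under these assumptions, for N = 200 the size lives on line 0 in the first layout and on line 3 in the second, so reading a short string from the second layout touches a cache line that holds nothing else of interest. The second layout is one byte smaller (201 versus 202 bytes).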

