From: Phil Endecott (spam_from_boost_dev_at_[hidden])
Date: 2019-12-05 15:32:10


Dear All,

Considering small-capacity fixed_strings like the example
that I gave previously:

struct NameAndAddress {
  static_string<14> firstname;
  static_string<14> surname;
  static_string<30> address_line_1;
  static_string<22> city;
  static_string<10> postcode;
};

I have been looking at how best to copy and compare them.
For small-capacity strings, and also for large-capacity
strings when their size approaches the capacity, it will
be quicker to unconditionally copy/compare everything,
including the unused bytes beyond the end, rather than
looping over just the current size.

I've tried a very crude benchmark based on this:

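#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>

// Capacity under test; 255 is one of the values tried, vary it per run.
constexpr std::size_t N = 255;

struct string {
  std::uint8_t size;
  std::array<char,N> data;
};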

extern void copy_all(string& dest, const string& src)
{
  dest = src;
}

extern void copy_size(string& dest, const string& src)
{
  dest.size = src.size;
  std::copy(src.data.begin(), src.data.begin()+src.size, dest.data.begin());
}

Results will depend on the architecture but I found that for
N < 64 it was probably always best to copy_all(). For N==255,
time for copy_all() and copy_size() is about equal when the
container is half full.

Things are more complicated for operator== because the unused
bytes would take part in the comparison. It would be necessary
to initialise everything to 0 and maintain that during e.g.
erase() and assign(). I haven't tried to benchmark that but I
suspect the trade-off would be similar.
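
To make the zero-padding idea concrete, here is a minimal sketch. The type padded_string and the function equal_all are hypothetical names invented for illustration, not part of fixed_string or static_string, and only assign() and erase() are shown. The assumption is that every mutating operation leaves all bytes beyond size set to 0, so that a whole-buffer memcmp() is a valid equality test.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string_view>

// Hypothetical type for illustration only.
// Invariant: every byte of data at index >= size is 0.
template <std::size_t N>
struct padded_string {
  static_assert(N <= 255, "size is stored in a uint8_t");

  std::uint8_t size = 0;
  std::array<char, N> data{};  // value-initialised, so all bytes start at 0

  void assign(std::string_view s) {
    assert(s.size() <= N);
    std::copy(s.begin(), s.end(), data.begin());
    // Re-zero the tail so the invariant survives a shrinking assign.
    std::fill(data.begin() + s.size(), data.end(), '\0');
    size = static_cast<std::uint8_t>(s.size());
  }

  void erase(std::size_t pos, std::size_t count) {
    assert(pos <= size && count <= size - pos);
    // Shift the tail down, then zero the bytes that were vacated.
    std::copy(data.begin() + pos + count, data.begin() + size,
              data.begin() + pos);
    std::fill(data.begin() + (size - count), data.begin() + size, '\0');
    size = static_cast<std::uint8_t>(size - count);
  }
};

// Compare the full capacity unconditionally. The size check is still
// needed to distinguish strings that differ only in trailing NULs.
template <std::size_t N>
bool equal_all(const padded_string<N>& a, const padded_string<N>& b) {
  return a.size == b.size &&
         std::memcmp(a.data.data(), b.data.data(), N) == 0;
}
```

The cost of this approach is the extra std::fill() in every operation that shrinks the string, which is part of the trade-off that would need benchmarking.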

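operator< is a mess due to endianness: comparing words as
integers only matches lexicographic byte order if the words are
loaded big-endian, so a little-endian machine needs a byte swap
per word. I don't know whether any compilers or memcmp()
implementations are smart enough to use byte-swap instructions
to process data 4 or 8 bytes at a time.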
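
As a rough sketch of what word-at-a-time ordering could look like, the following compares 8 bytes per step. The names load_be64 and compare_words are hypothetical. The load is written portably with shifts so that it is correct on any endianness; whether a particular compiler turns it into a single load plus byte-swap instruction is an assumption that would need checking in the generated code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Read 8 bytes as a big-endian integer, so that unsigned integer order
// matches byte-wise lexicographic order.
inline std::uint64_t load_be64(const char* p) {
  std::uint64_t v = 0;
  for (int i = 0; i < 8; ++i)
    v = (v << 8) | static_cast<unsigned char>(p[i]);
  return v;
}

// Hypothetical helper: returns <0, 0 or >0, like memcmp(), for two
// buffers of n bytes each.
inline int compare_words(const char* a, const char* b, std::size_t n) {
  std::size_t i = 0;
  for (; i + 8 <= n; i += 8) {
    const std::uint64_t x = load_be64(a + i);
    const std::uint64_t y = load_be64(b + i);
    if (x != y) return x < y ? -1 : 1;
  }
  // Fewer than 8 bytes left: fall back to memcmp() for the tail.
  return std::memcmp(a + i, b + i, n - i);
}
```

With zero-padded storage, one option would be to call this over the full capacity and break ties on the stored size, but I have not measured whether that beats a plain memcmp().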

I don't think I'm suggesting that this should necessarily be
implemented in fixed_string, at least not without lots more
investigation, but it does illustrate more of the trade-offs
that exist between different applications.

Regards, Phil.