From: Phil Endecott (spam_from_boost_dev_at_[hidden])
Date: 2008-01-09 20:25:42


Sebastian Redl wrote:
> Phil Endecott wrote:
>> For a UTF-8 string, my proposal offered
>>
>> a mutable random-access byte iterator
>>
> What is the use case for this?

It's for when you want to treat the data as a sequence of bytes. For
example, another thread at the moment is discussing base64 encoding.
The input to a base64 encoder could be a byte stream iterator.
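
As a rough sketch of what I mean, here is an encoder that takes nothing
but a pair of byte iterators. The name base64_encode and its signature
are made up for illustration and are not part of my proposal or of any
existing Boost interface; it assumes the standard alphabet with '='
padding.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper (illustration only): encodes the bytes in
// [first, last) as base64 using the standard alphabet and '=' padding.
// ByteIterator only needs to be an input iterator over char-sized values.
template <typename ByteIterator>
std::string base64_encode(ByteIterator first, ByteIterator last) {
    static const char alphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    while (first != last) {
        // Gather up to three input bytes into one 24-bit group.
        std::uint32_t group = 0;
        int count = 0;
        for (; count < 3 && first != last; ++count, ++first) {
            const unsigned char byte = static_cast<unsigned char>(*first);
            group |= static_cast<std::uint32_t>(byte) << (16 - 8 * count);
        }
        // count input bytes yield count + 1 output characters;
        // the rest of the four-character block is padding.
        for (int i = 0; i < 4; ++i) {
            if (i <= count) {
                out += alphabet[(group >> (18 - 6 * i)) & 0x3F];
            } else {
                out += '=';
            }
        }
    }
    return out;
}
```

Given a UTF-8 string s, base64_encode(s.begin(), s.end()) encodes the
underlying bytes and never needs to know where the character boundaries
are, which is the point of exposing a byte iterator.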

There are also cases where you can exploit knowledge about the encoding
to use a byte iterator in place of a character iterator. Specifically,
in UTF-8 every byte of a multi-byte character has a value of 128 or
more, so a byte below 128 is always a complete ASCII character. In a
parser, I might want to skip forward to the next '"' or '<' or
whatever; since those are both below 128, I can do this significantly
more efficiently using the byte iterator.
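
A minimal sketch of that kind of scan, assuming the UTF-8 data sits in
a std::string and the delimiter is an ASCII character (below 128). The
function name find_ascii_byte is hypothetical, chosen only for this
example.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helper (illustration only): returns the byte offset of the
// first occurrence of the ASCII delimiter `delim` at or after byte offset
// `start`, or utf8.size() if there is none.
// Assumes `delim` is below 128. Every byte of a multi-byte UTF-8 character
// is 128 or more, so a match can never fall inside such a character and no
// decoding is needed.
std::size_t find_ascii_byte(const std::string& utf8, std::size_t start, char delim) {
    if (start > utf8.size()) {
        start = utf8.size();  // clamp so the iterator arithmetic stays in range
    }
    const auto first = utf8.begin() + static_cast<std::string::difference_type>(start);
    return static_cast<std::size_t>(std::find(first, utf8.end(), delim) - utf8.begin());
}
```

The scan is a plain byte-wise std::find; a character iterator would have
to decode every multi-byte sequence along the way just to reach the same
position.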

>> Concerning mutable vs. immutable strings: which is best in any
>> particular case clearly depends on the size of the string, the
>> operation being performed, and whether it has a variable-length
>> encoding. The programmer should be allowed to choose which to use.
>> (An interesting case is where the size or character set changes at
>> run-time, and a run-time choice of algorithm is appropriate.)
>>
> Why on earth would you change the character set of a string at runtime?

I should have written "where the size or character set _varies_ at run-time".

Phil.

