From: Rogier van Dalen (rogiervd_at_[hidden])
Date: 2004-10-23 06:41:31


On Fri, 22 Oct 2004 14:49:46 -0400, Beman Dawes <bdawes_at_[hidden]> wrote:
> At 01:10 PM 10/22/2004, Miro Jurisic wrote:
>
> >boost::fs, as far as I understand it, ran into the problem that it was
> >impossible to sidestep the invariant.
>
> No, rather that the error check was on by default. Some people want it off
> as the default.
>
> As far as Unicode strings are concerned, the question is a little
> different. Is it well defined behavior to create a string that does not
> meet the Unicode invariants? If so, can ordinary operations break
> invariants, or is such dangerous activity restricted to "experts only"
> functions?

I guess you could say that all ordinary operations may take place on
three different levels: code units, codepoints, and characters.
Appending a code unit to the sequence of code units may make it
uninterpretable as a codepoint sequence. Appending a codepoint may make
it uninterpretable as a sequence of characters. The problem, I think,
is not the operations themselves, but rather the level they operate on.
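
To make the code unit level concrete, here is a minimal sketch of a
well-formedness check for UTF-8. The name is_valid_utf8 is hypothetical
and only for illustration; it is not taken from any proposed interface.
It shows that a single push_back of a code unit can turn a valid
sequence into one that cannot be read as codepoints.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, for illustration only.
// Returns true if `s` is a well-formed UTF-8 code unit sequence.
bool is_valid_utf8(const std::string& s)
{
    // Smallest codepoint that legitimately needs 1, 2, 3 or 4 code units.
    static const unsigned long min_cp[] = {0, 0, 0x80, 0x800, 0x10000};

    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char lead = static_cast<unsigned char>(s[i]);
        std::size_t len;
        unsigned long cp;
        if (lead < 0x80)                { len = 1; cp = lead; }
        else if ((lead & 0xE0) == 0xC0) { len = 2; cp = lead & 0x1F; }
        else if ((lead & 0xF0) == 0xE0) { len = 3; cp = lead & 0x0F; }
        else if ((lead & 0xF8) == 0xF0) { len = 4; cp = lead & 0x07; }
        else return false;              // stray continuation or invalid lead

        assert(len >= 1 && len <= 4);   // guards the table lookup below
        if (n - i < len) return false;  // sequence is cut off

        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char c = static_cast<unsigned char>(s[i + k]);
            if ((c & 0xC0) != 0x80) return false;  // not a continuation
            cp = (cp << 6) | (c & 0x3F);
        }

        if (cp < min_cp[len]) return false;               // overlong form
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;   // surrogate
        if (cp > 0x10FFFF) return false;                  // out of range
        i += len;
    }
    return true;
}
```

Starting from the valid string "caf", appending the code unit 0xC3
makes is_valid_utf8 return false, and appending 0xA9 after it makes the
sequence valid again (it now ends in U+00E9). The same thing happens
one level up: a string consisting only of U+0301 (combining acute
accent) is a fine codepoint sequence, but there is no base character
for the accent to combine with.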

I have not yet found examples where a non-const code unit or codepoint
sequence is needed, except for input. I think initialising from a code
unit sequence (say, a UTF-8 encoded file) given as two iterators, as
shown by Peter Dimov, would be just right.

(You can always make your own UTF-8 sequence and put it into a Unicode
string, of course, but this will probably mean copying the data.)

(For output, non-mutating access to the code units may be provided,
for example to UTF-16 code units if you want to interface with Win32
API functions.)
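
A rough sketch of what such an interface could look like, under my own
assumptions: the class name unicode_string, the member utf16(), and the
choice of UTF-16 as internal storage are all hypothetical, and the code
uses C++11 character types for brevity. The constructor takes a UTF-8
code unit sequence as two iterators and refuses malformed input, and
the only access to the code units is const, so the invariant cannot be
broken after construction.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical immutable Unicode string; a sketch, not a proposed design.
class unicode_string
{
public:
    // Initialise from UTF-8 code units in [first, last).
    // Throws std::invalid_argument if the input is not well-formed UTF-8.
    template <class InputIt>
    unicode_string(InputIt first, InputIt last)
    {
        // Smallest codepoint that needs 0, 1, 2 or 3 continuation units.
        static const char32_t min_cp[] = {0, 0x80, 0x800, 0x10000};

        while (first != last) {
            const unsigned char lead = static_cast<unsigned char>(*first);
            ++first;
            int extra;
            char32_t cp;
            if (lead < 0x80)                { extra = 0; cp = lead; }
            else if ((lead & 0xE0) == 0xC0) { extra = 1; cp = lead & 0x1F; }
            else if ((lead & 0xF0) == 0xE0) { extra = 2; cp = lead & 0x0F; }
            else if ((lead & 0xF8) == 0xF0) { extra = 3; cp = lead & 0x07; }
            else throw std::invalid_argument("invalid UTF-8 lead unit");

            for (int k = 0; k < extra; ++k) {
                if (first == last)
                    throw std::invalid_argument("truncated UTF-8 sequence");
                const unsigned char c = static_cast<unsigned char>(*first);
                if ((c & 0xC0) != 0x80)
                    throw std::invalid_argument("invalid UTF-8 continuation");
                ++first;
                cp = (cp << 6) | (c & 0x3F);
            }

            if (cp < min_cp[extra] || cp > 0x10FFFF ||
                (cp >= 0xD800 && cp <= 0xDFFF))
                throw std::invalid_argument("overlong or invalid codepoint");

            append_utf16(cp);
        }
    }

    // Non-mutating access to the UTF-16 code units, e.g. for passing
    // utf16().c_str() to an API that expects UTF-16.
    const std::u16string& utf16() const { return units_; }

private:
    void append_utf16(char32_t cp)
    {
        assert(cp <= 0x10FFFF);  // already checked by the constructor
        if (cp < 0x10000) {
            units_.push_back(static_cast<char16_t>(cp));
        } else {
            cp -= 0x10000;       // encode as a surrogate pair
            units_.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            units_.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }

    std::u16string units_;
};
```

Constructing it from the bytes of a file would look like
unicode_string s(buffer.begin(), buffer.end()). This does copy and
transcode the data, which is the cost mentioned above.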

Rogier

