From: Erik Wien (wien_at_[hidden])
Date: 2005-03-17 11:05:20


Thorsten Ottosen wrote:
> hm...the function is only going to be used by 3 different classes, right?
> If so at most 3 times the size of a virtual function solution;

No, five I think: UTF-8, plus UTF-16 and UTF-32 in both byte orders. The
ones in the platform's reversed byte order would only really be used for
file parsing though, whenever we get around to that...

> v-tables fill up too; and virtual functions in a class template
> can have *large* code size impact if not all virtual functions
> are used. (So are they?)

The idea is to keep the virtual interface to a bare minimum, and let the
string class itself build its own richer interface by combining these
virtual functions. The implementation would basically just have functions
for setting, getting and iteration, meaning they should all be used
frequently.
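
To make that concrete, here is a rough sketch of the split. All names
(string_impl, utf32_impl, unicode_string) are made up for illustration and
are not the prototype's actual classes; only a UTF-32 backend is shown.

```cpp
#include <cstddef>
#include <cstdint>
#include <initializer_list>
#include <memory>
#include <utility>
#include <vector>

using code_point = std::uint32_t;

// Hypothetical minimal virtual core: size, get, set and append only.
class string_impl {
public:
    virtual ~string_impl() = default;
    virtual std::size_t size() const = 0;
    virtual code_point get(std::size_t index) const = 0;
    virtual void set(std::size_t index, code_point value) = 0;
    virtual void push_back(code_point value) = 0;
};

// One possible backend; UTF-8 and UTF-16 backends would implement the
// same four functions over their own code unit storage.
class utf32_impl final : public string_impl {
public:
    std::size_t size() const override { return data_.size(); }
    code_point get(std::size_t i) const override { return data_.at(i); }
    void set(std::size_t i, code_point v) override { data_.at(i) = v; }
    void push_back(code_point v) override { data_.push_back(v); }
private:
    std::vector<code_point> data_;
};

// The user-facing class: everything here is non-virtual and is written
// purely in terms of the small virtual core above.
class unicode_string {
public:
    static constexpr std::size_t npos = static_cast<std::size_t>(-1);

    explicit unicode_string(std::unique_ptr<string_impl> impl)
        : impl_(std::move(impl)) {}

    std::size_t size() const { return impl_->size(); }

    void append(std::initializer_list<code_point> cps) {
        for (code_point cp : cps) impl_->push_back(cp);
    }

    // Returns the index of the first match, or npos if there is none.
    std::size_t find(code_point cp) const {
        for (std::size_t i = 0; i < impl_->size(); ++i)
            if (impl_->get(i) == cp) return i;
        return npos;
    }

    void replace_all(code_point from, code_point to) {
        for (std::size_t i = 0; i < impl_->size(); ++i)
            if (impl_->get(i) == from) impl_->set(i, to);
    }

private:
    std::unique_ptr<string_impl> impl_;
};
```

Since find and replace_all only ever call get, set and size, every virtual
function ends up being used, which is the point made above.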

> sometimes strong typesafety is good; sometimes it's not

Yep. What we need to decide is whether it is good more often than it is
not. :)

> ok, that seems to motivate that some form of dynamic types should be there.

That's what I thought too a while ago, but I'm not that sure anymore.
I'll admit I'm no iostream wizard, but wouldn't it be possible to create
some kind of unicode_stream by making a specialization of char_traits
for unsigned ints (Unicode code points)? We could then create some facets
(I forget which ones, codecvt and ctype I guess) that enable these streams
to read all Unicode encoding forms from their buffer, and transcode into
a sequence of Unicode code points before returning them to the user.
This would mean that users would not have to know what kind of
encoding is used in the file they are reading. It would be totally
transparent to them.

> It seems to me that we then need four classes
>
> utf8_string
> utf16_string
> utf32_string
> utf_string // the dynamic one

The first three could be created by having one class template
parameterized on encoding, and have it use the encoding_traits classes
from the current prototype. I have tried this before, and it works fine.
The necessity of the last one would depend on whether the iostream
functionality I mentioned above would work or not. If it is possible, I
don't really see the need for a dynamic string class either.
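
Roughly what I mean by the single template, as a sketch. The traits
below are simplified stand-ins I made up for this mail; the prototype's
encoding_traits classes may look different, and there is no validation of
the code points passed in.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using code_point = std::uint32_t;

// Hypothetical traits: each one names its code unit type and knows how
// to encode a single code point into that form.
struct utf8_traits {
    using code_unit = std::uint8_t;
    static void encode(code_point cp, std::vector<code_unit>& out) {
        auto put = [&out](code_point v) {
            out.push_back(static_cast<code_unit>(v));
        };
        if (cp < 0x80) {
            put(cp);
        } else if (cp < 0x800) {
            put(0xC0 | (cp >> 6));
            put(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            put(0xE0 | (cp >> 12));
            put(0x80 | ((cp >> 6) & 0x3F));
            put(0x80 | (cp & 0x3F));
        } else {
            put(0xF0 | (cp >> 18));
            put(0x80 | ((cp >> 12) & 0x3F));
            put(0x80 | ((cp >> 6) & 0x3F));
            put(0x80 | (cp & 0x3F));
        }
    }
};

struct utf16_traits {
    using code_unit = std::uint16_t;
    static void encode(code_point cp, std::vector<code_unit>& out) {
        if (cp < 0x10000) {
            out.push_back(static_cast<code_unit>(cp));
        } else {
            // Split into a high and a low surrogate.
            cp -= 0x10000;
            out.push_back(static_cast<code_unit>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<code_unit>(0xDC00 + (cp & 0x3FF)));
        }
    }
};

struct utf32_traits {
    using code_unit = std::uint32_t;
    static void encode(code_point cp, std::vector<code_unit>& out) {
        out.push_back(cp);
    }
};

// One template covers all three statically typed strings.
template <class Traits>
class encoded_string {
public:
    using code_unit = typename Traits::code_unit;

    void push_back(code_point cp) { Traits::encode(cp, units_); }
    std::size_t unit_count() const { return units_.size(); }
    const std::vector<code_unit>& units() const { return units_; }

private:
    std::vector<code_unit> units_;
};

using utf8_string  = encoded_string<utf8_traits>;
using utf16_string = encoded_string<utf16_traits>;
using utf32_string = encoded_string<utf32_traits>;
```

No virtual functions are involved here; the encoding is fixed at compile
time, which is exactly what the dynamic utf_string would give up.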

- Erik

