From: Aristid Breitkreuz (aribrei_at_[hidden])
Date: 2006-09-16 06:55:47
On Saturday, 16.09.2006, at 02:22 +0200, loufoque wrote:
> Since no one has old code for reuse, I will start to write a few usable
> tools from scratch.
Well, I have some new code, though with little functionality so far. Feel
free to contact me personally if you are interested.
> Note that I am not an Unicode expert nor a C++ guru.
> I am just willing to work in that area and hope my code could be useful
> to some.
I do think that's important. As we saw, the lack of Unicode support is a
serious problem, one that is even hindering a Boost XML library.
> Feel free to comment and give ideas, since I think the design is the
> most important thing first, especially for usage with boost, even though
> this topic has already been discussed a few times.
> string/wstring is not really suited to contain unicode data, because of
> limitations of char_traits, the basic_string interface, and the
> dependence on locales of the string and wstring types.
> I think it is better to consider the string, char, wstring and
> wchar_t types to be in the system locales and to use a separate type
> for unicode strings.
In the optimal case, the system is Unicode-aware. Oh well.
> The aim would then be to provide an abstract unicode string type
> independent from C++ locales on the grapheme clusters level, while also
> giving access to lower levels.
Abstract? You mean like virtual void foo() = 0;? Probably better not.
> It would only handle unicode in a generic way at the beginning (no
> locales or tailored things).
That's fine. Do you have plans on which Unicode encoding to use?
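To make the encoding question concrete, here is a minimal sketch of how one
code point turns into code units under UTF-8 and UTF-16. The function names
encode_utf8 and encode_utf16 are hypothetical and not taken from any existing
library, and the input is assumed to be a valid code point (not a surrogate,
not above U+10FFFF); a real implementation would have to validate that.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: encode one code point as 1 to 4 UTF-8 code units.
// Assumes cp is a valid Unicode scalar value (no validation is done).
std::string encode_utf8(std::uint32_t cp) {
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Hypothetical helper: encode one code point as 1 or 2 UTF-16 code units.
// Code points above U+FFFF become a surrogate pair.
std::vector<std::uint16_t> encode_utf16(std::uint32_t cp) {
    std::vector<std::uint16_t> out;
    if (cp < 0x10000) {
        out.push_back(static_cast<std::uint16_t>(cp));
    } else {
        cp -= 0x10000;
        out.push_back(static_cast<std::uint16_t>(0xD800 | (cp >> 10)));
        out.push_back(static_cast<std::uint16_t>(0xDC00 | (cp & 0x3FF)));
    }
    return out;
}
```

The point is only that the number of code units per code point depends on the
encoding, so the choice affects indexing and iteration at the lower levels.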
> This string could maintain the characters in a normalized form (which
> means potential loss of information about singleton characters) in order
> to allow more efficient comparison and searching.
> It would use a policy-based design in order to be as generic as possible
> and therefore customizable on many levels, allowing to use the data
> structure and encoding you need for interfacing with other libraries.
> The policy-based design would also provide functionality similar to
> flex_string, to explicitly choose whether to use COW or other
> optimizations depending on the situation.
> There would also be const versions, following the const_string design.
Be careful that it remains usable.
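As a rough illustration of what a policy-based string could look like while
staying usable, here is a minimal sketch in which the encoding is a template
parameter. All names (ustring, utf16_encoding, utf32_encoding, append) are
hypothetical and only meant to show the shape of the idea. The sketch counts
code points, not grapheme clusters, and does no validation or normalization.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical encoding policy: one 32-bit code unit per code point.
struct utf32_encoding {
    typedef std::uint32_t code_unit;
    static void append(std::vector<code_unit>& out, std::uint32_t cp) {
        out.push_back(cp);
    }
};

// Hypothetical encoding policy: UTF-16, with surrogate pairs above U+FFFF.
struct utf16_encoding {
    typedef std::uint16_t code_unit;
    static void append(std::vector<code_unit>& out, std::uint32_t cp) {
        if (cp < 0x10000) {
            out.push_back(static_cast<code_unit>(cp));
        } else {
            cp -= 0x10000;
            out.push_back(static_cast<code_unit>(0xD800 | (cp >> 10)));
            out.push_back(static_cast<code_unit>(0xDC00 | (cp & 0x3FF)));
        }
    }
};

// Hypothetical string class: storage layout is decided by the policy,
// while the interface stays the same for every encoding.
template <class Encoding>
class ustring {
public:
    typedef typename Encoding::code_unit code_unit;

    ustring() : length_(0) {}

    // Append one code point (assumed valid) in the chosen encoding.
    void push_back(std::uint32_t cp) {
        Encoding::append(units_, cp);
        ++length_;
    }

    // Number of code points stored.
    std::size_t length() const { return length_; }

    // Lower-level access to the raw code units.
    const std::vector<code_unit>& code_units() const { return units_; }

private:
    std::vector<code_unit> units_;
    std::size_t length_;
};
```

With this shape, ustring<utf16_encoding> and ustring<utf32_encoding> expose
the same member functions, so user code does not change when the policy does.
Every extra policy multiplies the number of distinct types, though, which is
where usability can suffer.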
> Just like super_string, the class would bundle algorithms from
> string_algo, since it can probably implement them in a more efficient
> way than iterating over the grapheme clusters.
Oh well, I don't know if I like that. You should probably concentrate on
the core functionality first. You can probably add specialisations to
string_algo later. Way later :-).