From: Aristid Breitkreuz (aribrei_at_[hidden])
Date: 2006-09-16 06:55:47


Hi,

On Saturday, 16.09.2006 at 02:22 +0200, loufoque wrote:
> Since no one has old code for reuse, I will start to write a few usable
> tools from scratch.

Well, I have some new code with little functionality. Feel free to contact
me personally if you are interested.

> Note that I am neither a Unicode expert nor a C++ guru.
> I am just willing to work in that area and hope my code could be useful
> to some.

I do think that's important. As we saw, the lack of Unicode support is a
serious problem, even hindering a Boost XML library.

>
> Feel free to comment and give ideas, since I think the design is the
> most important thing first, especially for usage with boost, even though
> this topic has already been discussed a few times.
>
> string/wstring is not really suited to contain Unicode data, because of
> the limitations of char_traits, the basic_string interface, and the
> dependence of the string and wstring types on locales.
> I think it is better to consider the string, char[], wstring and
> wchar_t[] types to be in the system locales and to use a separate type
> for Unicode strings.

In the optimal case, the system is Unicode-aware. Oh well.

>
> The aim would then be to provide an abstract Unicode string type,
> independent of C++ locales, at the grapheme cluster level, while also
> giving access to lower levels.

Abstract? You mean like virtual foo() = 0; ? Probably better not.

> It would only handle Unicode in a generic way at the beginning (no
> locales or tailored things).

That's fine. Do you have plans for which Unicode encoding to use
internally?
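
To illustrate why the internal encoding matters, here is a minimal sketch
of how many code units a single code point occupies in UTF-8, UTF-16 and
UTF-32. The function names (is_scalar_value, utf8_length, utf16_length,
utf32_length) are made up for this example and do not come from any
existing Boost library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true for a valid Unicode scalar value
// (at most U+10FFFF and not a surrogate).
inline bool is_scalar_value(std::uint32_t cp) {
    return cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF);
}

// Number of 8-bit code units needed for cp in UTF-8.
inline std::size_t utf8_length(std::uint32_t cp) {
    assert(is_scalar_value(cp));
    if (cp < 0x80) return 1;
    if (cp < 0x800) return 2;
    if (cp < 0x10000) return 3;
    return 4;
}

// Number of 16-bit code units needed for cp in UTF-16
// (two for code points outside the Basic Multilingual Plane).
inline std::size_t utf16_length(std::uint32_t cp) {
    assert(is_scalar_value(cp));
    return cp < 0x10000 ? 1 : 2;
}

// UTF-32 always uses exactly one 32-bit code unit per code point.
inline std::size_t utf32_length(std::uint32_t cp) {
    assert(is_scalar_value(cp));
    (void)cp; // avoid an unused-parameter warning when NDEBUG is defined
    return 1;
}
```

Only UTF-32 gives constant-time indexing by code point. With UTF-8 or
UTF-16, finding the n-th code point means walking the string, and none
of the three gives constant-time indexing by grapheme cluster.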

> This string could maintain the characters in a normalized form (which
> means potential loss of information about singleton characters) in order
> to allow more efficient comparison and searching.
>
> It would use a policy-based design in order to be as generic as possible
> and therefore customizable on many levels, allowing you to use the data
> structure and encoding you need for interfacing with other libraries.
>
> The policy-based design would also provide functionality similar to
> flex_string, to explicitly choose whether to use COW or other
> optimizations depending on the situation.
>
> There would also be a const version, following the const_string design.

Be careful that it remains usable.
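
For the sake of discussion, a policy-based string could look roughly
like the sketch below. It is only one possible reading of the proposal:
basic_unicode_string, utf8_encoding and utf32_encoding are hypothetical
names, and the storage policy here is simply any container with
push_back and size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical encoding policy: stores code points as UTF-8 code units.
struct utf8_encoding {
    typedef unsigned char code_unit;

    template <class Storage>
    static void append(Storage& out, std::uint32_t cp) {
        // Only valid scalar values (no surrogates, at most U+10FFFF).
        assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));
        if (cp < 0x80) {
            out.push_back(static_cast<code_unit>(cp));
        } else if (cp < 0x800) {
            out.push_back(static_cast<code_unit>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<code_unit>(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {
            out.push_back(static_cast<code_unit>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<code_unit>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<code_unit>(0x80 | (cp & 0x3F)));
        } else {
            out.push_back(static_cast<code_unit>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<code_unit>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<code_unit>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<code_unit>(0x80 | (cp & 0x3F)));
        }
    }
};

// Hypothetical encoding policy: one 32-bit code unit per code point.
struct utf32_encoding {
    typedef std::uint32_t code_unit;

    template <class Storage>
    static void append(Storage& out, std::uint32_t cp) {
        assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));
        out.push_back(cp);
    }
};

// The string is parameterized on an encoding policy and a storage policy.
template <class Encoding,
          class Storage = std::vector<typename Encoding::code_unit> >
class basic_unicode_string {
public:
    // Append one code point, encoded as the policy dictates.
    void push_back(std::uint32_t cp) { Encoding::append(units_, cp); }

    // Lower-level access to the raw code units.
    const Storage& code_units() const { return units_; }
    std::size_t code_unit_count() const { return units_.size(); }

private:
    Storage units_;
};
```

With this, basic_unicode_string<utf8_encoding> and
basic_unicode_string<utf32_encoding> share one interface but are
distinct types, which is exactly where usability can suffer: every
function that takes such a string has to be a template or pick one
instantiation.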

>
> Just like super_string, the class would bundle algorithms from
> string_algo, since it can probably implement them in a more efficient
> way than iterating over the grapheme clusters.

Oh well, I don't know if I like that. You should probably concentrate on
the core functionality first. You can probably add specialisations to
string_algo later. Way later :-).

Kind regards,
Aristid Breitkreuz
