
Hi, On Saturday, 16.09.2006, at 02:22 +0200, loufoque wrote:
Since no one has old code for reuse, I will start to write a few usable tools from scratch.
Well, I have some new code with little functionality. Feel free to contact me personally if you are interested.
Note that I am neither a Unicode expert nor a C++ guru. I am just willing to work in that area and hope my code could be useful to some.
I do think that's important. As we saw, lack of Unicode support is a serious problem, one that is even hindering a Boost XML library.
Feel free to comment and suggest ideas. I think getting the design right comes first, especially for use with Boost, even though this topic has already been discussed a few times.
string/wstring is not really suited to contain Unicode data, because of the limitations of char_traits, the basic_string interface, and the dependence of the string and wstring types on locales. I think it is better to consider the string, char[], wstring and wchar_t[] types to be in the system locale and to use a separate type for Unicode strings.
In the optimal case, the system is Unicode-aware. Oh well.
The aim would then be to provide an abstract Unicode string type that works at the grapheme cluster level, independent of C++ locales, while also giving access to the lower levels.
Abstract? You mean like virtual foo() = 0; ? Probably better not.
It would only handle unicode in a generic way at the beginning (no locales or tailored things).
That's fine. Do you have plans on which Unicode encoding to use internally?
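To make the encoding question concrete, here is a minimal sketch of why the choice matters: the same text takes a different number of code units in UTF-8, UTF-16 and UTF-32. The helper names decode_utf8 and utf16_length are made up for this illustration, the code assumes a compiler with char32_t, and it does no validation of malformed input.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: decode well-formed UTF-8 into code points.
// Assumption: the input is valid UTF-8; malformed sequences are not detected.
std::u32string decode_utf8(const std::string& in) {
    std::u32string out;
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char b = static_cast<unsigned char>(in[i]);
        char32_t cp;
        std::size_t extra;  // number of continuation bytes that follow
        if (b < 0x80)      { cp = b;        extra = 0; }
        else if (b < 0xE0) { cp = b & 0x1F; extra = 1; }
        else if (b < 0xF0) { cp = b & 0x0F; extra = 2; }
        else               { cp = b & 0x07; extra = 3; }
        ++i;
        for (std::size_t k = 0; k < extra && i < in.size(); ++k, ++i) {
            cp = (cp << 6) | (static_cast<unsigned char>(in[i]) & 0x3F);
        }
        out.push_back(cp);
    }
    return out;
}

// Hypothetical helper: number of UTF-16 code units needed for the same text.
// Code points above U+FFFF need a surrogate pair, hence two units.
std::size_t utf16_length(const std::u32string& cps) {
    std::size_t n = 0;
    for (char32_t cp : cps) n += (cp > 0xFFFF) ? 2 : 1;
    return n;
}
```

For a string holding 'a', U+00E9 and U+1D11E, this gives 7 UTF-8 code units, 4 UTF-16 code units and 3 code points, so whichever encoding is used internally determines what "length" and "index" mean at the lowest level.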
This string could maintain the characters in a normalized form (which means potential loss of information about singleton characters) in order to allow more efficient comparison and searching.
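As a toy illustration of the singleton point: a singleton is a code point whose canonical decomposition is a single different code point, so normalizing replaces it and the original choice of character cannot be recovered afterwards. The sketch below covers only three well-known singletons and is not a real normalization; the names fold_singleton and fold_singletons are hypothetical, and a real implementation would need the full Unicode Character Database tables.

```cpp
#include <string>

// Toy illustration only: maps three singleton code points to their
// canonical equivalents. This is NOT NFC or NFD.
char32_t fold_singleton(char32_t cp) {
    switch (cp) {
        case 0x2126: return 0x03A9; // OHM SIGN -> GREEK CAPITAL LETTER OMEGA
        case 0x212A: return 0x004B; // KELVIN SIGN -> LATIN CAPITAL LETTER K
        case 0x212B: return 0x00C5; // ANGSTROM SIGN -> LATIN CAPITAL LETTER A WITH RING ABOVE
        default:     return cp;
    }
}

// Apply the mapping to every code point of a string.
std::u32string fold_singletons(std::u32string s) {
    for (char32_t& cp : s) cp = fold_singleton(cp);
    return s;
}
```

Once both sides have been folded, a plain code point by code point comparison treats U+212B and U+00C5 as equal, which is the efficiency gain mentioned above. The price is that the string no longer records which of the two was originally written.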
It would use a policy-based design in order to be as generic as possible and therefore customizable on many levels, allowing you to use the data structure and encoding you need for interfacing with other libraries.
The policy-based design would also provide functionality similar to flex_string, to explicitly choose whether to use COW or other optimizations depending on the situation.
There would also be const versions, following the const_string design.
Be careful that it remains usable.
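For what it's worth, the policy-based idea discussed above might look roughly like the sketch below. Every name in it (utf8_encoding, utf16_encoding, vector_storage, unicode_string) is invented for illustration and is not taken from any existing code or proposal; it assumes a compiler with char16_t and shows only the lowest (code unit) level, not code points or grapheme clusters.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical encoding policies: each one names its code unit type.
struct utf8_encoding  { typedef char     code_unit; };
struct utf16_encoding { typedef char16_t code_unit; };

// Hypothetical storage policy: plain contiguous storage, no COW.
// A COW or small-buffer policy could be swapped in here instead.
template <class CodeUnit>
struct vector_storage {
    typedef std::vector<CodeUnit> container;
};

// Sketch of a string parameterized on encoding and storage.
template <class Encoding,
          template <class> class Storage = vector_storage>
class unicode_string {
public:
    typedef typename Encoding::code_unit code_unit;

    unicode_string() {}
    unicode_string(const code_unit* first, const code_unit* last)
        : units_(first, last) {}

    // Lowest-level access: raw code units for interfacing with other libraries.
    const code_unit* data() const { return units_.data(); }
    std::size_t code_unit_count() const { return units_.size(); }

private:
    typename Storage<code_unit>::container units_;
};
```

Keeping a default for the storage policy means the common case stays a one-parameter type such as unicode_string<utf8_encoding>, which is one way to keep the thing usable.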
Just like super_string, the class would bundle algorithms from string_algo, since it can probably implement them in a more efficient way than iterating over the grapheme clusters.
Oh well, I don't know if I like that. You probably should concentrate on the core functionality first. You probably can add specialisations to string_algo later. Way later :-).
Kind regards,
Aristid Breitkreuz