From: Pavol Droba (droba_at_[hidden])
Date: 2007-02-27 15:23:30


Florin Trofin wrote:
> Hello,
>
> I was looking at the string algo library hoping that I can use it with
> my own string class and I have some questions:
>
> 1. Is the library designed to work with variable length encoding
> characters (like UTF8, UTF16). The answer seems to be no, but wanted
> to find for sure. Are there any plans to make these algorithms
> compatible with this type of sequences?

Generally, no. The preamble of the library specifies that "a string" is an
arbitrary sequence of characters. If you store a variable-length-encoded
string in a char* array, it will definitely not work, since your
container will store code units (bytes), not characters. If you design a
UTF-8 encoded string class with iterators that iterate over real
characters, then there is a good chance that the string library will be
functional.
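
To illustrate the difference between bytes and characters, here is a
minimal sketch in plain standard C++ (not part of string_algo). The helper
name count_code_points is hypothetical, and the sketch assumes the input
is valid UTF-8. It counts every byte that is not a continuation byte
(bit pattern 10xxxxxx):

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts Unicode code points in a UTF-8 byte string.
// Assumes the input is well-formed UTF-8; no validation is performed.
std::size_t count_code_points(const std::string& utf8) {
    std::size_t count = 0;
    for (unsigned char byte : utf8) {
        // Continuation bytes look like 10xxxxxx; every other byte
        // starts a new code point.
        if ((byte & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}
```

For the UTF-8 bytes of "naïve" (written as "na\xC3\xAFve"), size() reports
6 elements while count_code_points reports 5. An algorithm that iterates
over the std::string sees the 6 bytes, which is why a character-level
iterator would be needed.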

There is also an issue with C++ locales, which are not designed to support
this kind of encoding; therefore some algorithms will not work unless you
extend the locales as well.

>
> 2. Some algorithms seem to be strangely named and seem to overlap
> existing functionality in boost. For example split() and find_token()
> - why not using a boost::tokenizer? I think the library needs to be
> re-organized.

Boost tokenizer is a distinct library, not related to the string_algo
library. split and the underlying find_iterator functionality provided in
the string_algo library use a different design approach that is built
on the facilities in the library.

Boost is a collection of libraries, not just one library. And there
is nothing in the spirit of Boost that prevents this kind of overlap.

It is up to you to use the one that suits you better.

>
> 3. Internationalization support. I am not sure if these algorithms
> will work properly in all languages. For example to_upper/to_lower. I
> recall that in some languages, going from uppercase to lowercase (or
> viceversa) you go from one character to two (and viceversa). These two
> algorithms make the assumption that the correspondence is one to one.
>

The string_algo library depends solely on the facilities provided by the
standard C++ library, namely the locale facility. As far as I know, this
kind of conversion is not supported there.
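
The one-to-one limitation is visible in the interface of the standard
facility itself: std::toupper from <locale> takes a single character and
returns a single character. The sketch below uses only the standard
library; the function name upper_copy is hypothetical and is not the
string_algo implementation.

```cpp
#include <locale>
#include <string>

// Hypothetical helper: upper-cases a string one character at a time
// using the standard locale facility.
std::string upper_copy(const std::string& input,
                       const std::locale& loc = std::locale::classic()) {
    std::string output;
    output.reserve(input.size());
    for (char c : input) {
        // std::toupper(char, locale) maps one char to exactly one char,
        // so the result always has the same length as the input.
        output.push_back(std::toupper(c, loc));
    }
    return output;
}
```

Because every input character yields exactly one output character, a
mapping such as the German sharp s to "SS" cannot be expressed through
this interface, whichever locale is supplied.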

Best regards,
Pavol

