From: Vladimir Prus (ghost_at_[hidden])
Date: 2004-04-06 06:28:11


Hi Pavol,

> I have read your proposal. Maybe I'm missing something very serious,
> but I would prefer to have a scheme similar to the one used by the STL.
>
> That is, there would be variants accepting char and wchar_t data types,
> and all possible Unicode problems would be addressed by char_traits and
> locale.

Variants of what? The command line parser and config file parser will have
two variants of the interface.

The storage component need not have two variants. What advantage would that
give? Finally, and that's the most important point, I believe that the options
description component need only provide two variants of the 'value'
function. As the document says, if you have two variants of the
options_description class, then the ascii vs. unicode decision is global
for the entire application, which is not so good.

> I understand that the STL's support for Unicode is not the best,
> but there are facilities that can provide the required functionality if
> properly extended/configured.

Let's break the question into two parts.

1. Should 'unicode support' mean that there are two versions of each
interface, one using string and the other using wstring? I think this kind
of Unicode support is not good. It means that each library which accepts or
returns strings must ultimately have a double interface and either be
entirely in headers, or explicitly instantiate two variants in sources --
which doubles the size of the library.

2. Should the program_options library use UTF-8 or wstring? As I've said,
neither is a clear leader, but UTF-8 seems better.
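
To illustrate the cost described in point 1, here is a minimal sketch of a component templated on the character type and explicitly instantiated for both char and wchar_t in a source file. The class basic_parser and its member option_name are hypothetical names invented for this example; they are not part of program_options.

```cpp
#include <cassert>
#include <string>

// Hypothetical component, for illustration only.
template <class Ch>
class basic_parser {
public:
    // Returns the part of "name=value" before the first '='.
    std::basic_string<Ch> option_name(const std::basic_string<Ch>& token) const;
};

template <class Ch>
std::basic_string<Ch>
basic_parser<Ch>::option_name(const std::basic_string<Ch>& token) const
{
    // find() returns npos when there is no '=', and substr(0, npos)
    // then yields the whole token.
    typename std::basic_string<Ch>::size_type pos = token.find(Ch('='));
    return token.substr(0, pos);
}

// Explicit instantiation: the code of every member is emitted once per
// character type, so the object code for this component exists twice.
template class basic_parser<char>;
template class basic_parser<wchar_t>;
```

With both instantiations in place, basic_parser<char>().option_name("width=80") and basic_parser<wchar_t>().option_name(L"width=80") each resolve to their own compiled copy of the same logic.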

> I think that there is no big reason to try to reinvent the wheel and
> provide an all-encompassing solution in a library like program_options.
> It should be enough if it is unicode-enabled, so that it can be used in
> any specific scenario, provided that all necessary facilities are in
> place.

It's *far* from an all-encompassing solution. In fact, the changes in
program_options will include:

1. Adding ascii -> UTF-8 conversion in parsers
2. Adding UTF-8 -> ascii conversion in value parsers
3. Adding unicode parsers with UCS-4 -> UTF-8 conversion
4. Adding unicode value parsers and UTF-8 -> UCS-4 conversion

That's all, and given that there are at least two UTF-8 codecs announced on
the mailing list, not a lot of work. And this will add Unicode support
without changing the interface a bit.

- Volodya

