From: Pavol Droba (droba_at_[hidden])
Date: 2004-04-06 10:54:17


Hi,

On Tue, Apr 06, 2004 at 06:29:54PM +0400, Vladimir Prus wrote:

[snip]
 
> > This argument is quite questionable. IMHO either you stick with narrow or
> > wide characters in the whole application. Otherwise you are forced to make
> > conversions on the border lines. I don't really see a point in the mixed
> > type approach.
>
> Ok, let me rephrase. You're writing boost::http_proxy library and want to
> make it customizable via program_options. So you need to provide function
> 'get_options_descriptions'. What will the function return? If there's only
> one options_descriptions class, there's no question. If there are two
> versions, then which one do you return? No matter what you decide, the main
> application might need to do conversions just because it either needs
> unicode or does not need it.

Well, the http library has two options. Either it can be char_type independent
or it can simply accept only char* variants. In the case of an http library,
the latter will probably be the case, because it is quite a domain-specific library.

I see that we are generally arguing about whether the program_options library's
domain is generic enough to natively support char and wchar_t (and be templated),
or whether it is enough to provide an interface via conversions and support only
one encoding internally.

I'm in favor of the first approach.
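
For comparison, the second approach (one internal encoding, conversions at the interface) could be sketched as below. This is only an illustration: to_internal and from_internal are hypothetical names, not part of program_options, and the sketch assumes narrow strings internally and the classic "C" locale, so only ASCII is guaranteed to round-trip.

```cpp
#include <cstddef>
#include <locale>
#include <string>
#include <vector>

// Hypothetical boundary helpers for a library that stores only narrow
// strings internally. Every wide-character caller has to pass through
// them on the way in and on the way out.

// Wide -> narrow. Characters the locale cannot represent are replaced
// by '?' (the default passed to narrow()).
inline std::string to_internal(const std::wstring& in) {
    if (in.empty()) return std::string();
    const std::ctype<wchar_t>& ct =
        std::use_facet<std::ctype<wchar_t> >(std::locale::classic());
    std::vector<char> buf(in.size());
    ct.narrow(in.data(), in.data() + in.size(), '?', &buf[0]);
    return std::string(buf.begin(), buf.end());
}

// Narrow -> wide.
inline std::wstring from_internal(const std::string& in) {
    if (in.empty()) return std::wstring();
    const std::ctype<wchar_t>& ct =
        std::use_facet<std::ctype<wchar_t> >(std::locale::classic());
    std::vector<wchar_t> buf(in.size());
    ct.widen(in.data(), in.data() + in.size(), &buf[0]);
    return std::wstring(buf.begin(), buf.end());
}
```

A wide-character application would call to_internal on every option it passes in and from_internal on every value it reads back, which is the conversion overhead under discussion.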

The library works with various sources of information and its purpose is to
restructure the information from these sources into something more usable. For
such a utility, I would assume that the information passed in on input has the
same encoding and format as the information on output. From the nature of the
library, it seems that it might be possible to avoid an unnecessary conversion
into some intermediate encoding.

Another analogy might be a container. The library is a kind of container. It
parses the input and provides a container-like interface for the information stored
there. I find it natural that the container uses the same encoding for its
internals as it provides in the external interface.
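
The container analogy can be made concrete with a minimal sketch of an option store templated on the character type. All names here (basic_option_store, option_store, woption_store) and the "name=value" token format are assumptions made for illustration; they are not the program_options interface.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical sketch: the store keeps its data in the same string type
// that the caller passes in and reads back, so no conversion is needed.
template <typename CharT>
class basic_option_store {
public:
    typedef std::basic_string<CharT> string_type;

    // Parse tokens of the form "name=value". A token without '=' is
    // stored as a name with an empty value.
    explicit basic_option_store(const std::vector<string_type>& tokens) {
        for (std::size_t i = 0; i < tokens.size(); ++i) {
            const string_type& token = tokens[i];
            const typename string_type::size_type eq = token.find(CharT('='));
            if (eq == string_type::npos) {
                values_[token] = string_type();
            } else {
                values_[token.substr(0, eq)] = token.substr(eq + 1);
            }
        }
    }

    // True if the option was present in the input.
    bool has(const string_type& name) const {
        return values_.find(name) != values_.end();
    }

    // Returns the stored value, or an empty string if the option is absent.
    string_type get(const string_type& name) const {
        typename std::map<string_type, string_type>::const_iterator it =
            values_.find(name);
        return it == values_.end() ? string_type() : it->second;
    }

private:
    std::map<string_type, string_type> values_;
};

typedef basic_option_store<char>    option_store;
typedef basic_option_store<wchar_t> woption_store;
```

With this shape, a wide-character application constructs a woption_store from std::wstring tokens and gets std::wstring values back, while a narrow application uses option_store in exactly the same way.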

> And why an existing operator>> which works for istream only should be fixed
> to support wistream, if some other option need unicode support?
>

I don't understand this point.

[snip]
 
> I generally tend to ignore speed issues, since with linear time algorithms
> and contemporary processors it's not likely to be important. OTOH, code
> size *is* important. I've just compiled one of the library examples, with
> static linking and full optimization. It takes 152K.
>
> Probably, it's partly gcc's fault, or maybe it can be reduced, but now it's so.
> An empty program takes several K. Now, if I tell anyone "here's a good library
> for parsing command line but it will add 152K to the application size", then
> someone will say "thanks, I'll parse command line by hand".
>
> However, if the library is shared and is available on every Linux
> installation, then the code size is not an issue.
>

I don't think that an overhead of 152 KB is too big. We are living in a world
of GBs; a few KBs do not really change much. If an application uses some
STL stuff, it won't be very small anyway.
(Probably not the best example, but I have compiled the following program with
gcc 3.3.1 in Cygwin with the option -O3, and stripped the debug info afterwards:

#include <iostream>

using namespace std;

int main()
{
   cout << "a test" << endl;
   return 0;
}

The resulting binary is 200 KB.)

I would strongly prefer simpler usage of the library, even at the cost of 152 KB of overhead.
 
> > If my application is unicode, and all input I have is unicode, it is really
> > annoying to convert everything to and fro when interfacing to a library like
> > program_options.
>
> You don't have to convert anything. Parsers will accept wstring and for
> values where you need unicode you'll use wstring as well.
>

[snip]
 
> Some of the conversions are unavoidable. E.g. if you have unicode-enabled
> library, you'd still need to accept ascii input (because you can't expect
> that all input sources are unicode -- main in Linux is never unicode).
>
> If you want to support legacy operator>> you'd need conversion to ascii.

I'm not a Linux expert. I'm mainly working on Windows. If I decide to use unicode,
I have the whole API in unicode without any need for conversions.

Actually, in the project I'm working on now, I encountered a need for conversion
only once. I'm using the date_time library and there was no support for wide
strings at the time. Fortunately it is fixed now :)

Regards,

Pavol

