From: Jonathan Turkanis (technews_at_[hidden])
Date: 2005-01-07 03:28:18


Thorsten Ottosen wrote:
> "Jonathan Turkanis" <technews_at_[hidden]> wrote:

>>> 4. Thorsten asks why the widening and narrowing functions shouldn't
>>> be non-member functions. One answer is that code conversion can be
>>> (slightly) more efficient if a large buffer is used. Making the core
>>> conversion functions member functions allows buffers to be used for
>>> several string conversions.

>> I think the added flexibility of the overloads taking iterators is
>> more significant than the ability to buffer.
>
> I can't really figure out how this buffering should work. Buffering
> of what?

Look at the interface of std::codecvt, e.g. at the member function in(). This
function takes a narrow-character array as input and writes wide characters to
a second array. Since in() is a virtual function, it's slightly faster to call
it once or twice per string than to call it once for each character in a
string. Similar remarks hold for out(). To make this work, you need

(i) the input to be presented as a character array (std::string or const char*
is fine, but a pair of forward iterators isn't)
(ii) a good sized buffer for output

The example I presented doesn't satisfy (i). But in the absence of performance
data I won't worry about it.
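
To make the buffering concrete, here is a minimal sketch of a widening
function that satisfies (i) and (ii). The name widen_buffered and the
256-element buffer are my own assumptions for illustration, not part of the
proposal. It calls in() once per buffer-full rather than once per character:

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>      // std::mbstate_t
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper for illustration only; not part of the proposal.
inline std::wstring widen_buffered(const std::string& s,
                                   const std::locale& loc = std::locale())
{
    typedef std::codecvt<wchar_t, char, std::mbstate_t> codecvt_type;
    const codecvt_type& cvt = std::use_facet<codecvt_type>(loc);

    const std::size_t buf_size = 256;      // (ii) a good-sized output buffer
    wchar_t buf[buf_size];
    std::mbstate_t state = std::mbstate_t();
    std::wstring result;

    const char* from = s.data();           // (i) input is a contiguous array
    const char* const from_end = from + s.size();

    while (from != from_end) {
        const char* from_next = from;
        wchar_t* to_next = buf;

        // One virtual call converts as much input as fits in the buffer.
        const std::codecvt_base::result r =
            cvt.in(state, from, from_end, from_next,
                   buf, buf + buf_size, to_next);
        assert(buf <= to_next && to_next <= buf + buf_size);

        // Treat an explicit error, or a call that neither consumed input
        // nor produced output (e.g. a truncated sequence), as a failure.
        const bool no_progress = (from_next == from && to_next == buf);
        if (r == std::codecvt_base::error || no_progress)
            throw std::runtime_error("widen_buffered: conversion failed");

        result.append(buf, to_next);       // flush the converted characters
        from = from_next;                  // resume where in() stopped
    }
    return result;
}
```

With a pair of forward iterators as input, the characters would first have
to be copied into an array like this before in() could be called at all.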

> I agree that there should be iterator versions underneath the range
> interface (that's how it all works.)
>
> Jonathan:
>> template<typename WideStr>
>> basic_string<typename Codecvt::extern_type>
>> narrow(const WideStr& str)
>> {
>> string_converter<> cvt;
>> return cvt.narrow(str);
>> }
>
> This could be specified as
>
> template< class NarrowString, class ReadableWideForwardRange,
>           class Codecvt >
> NarrowString narrow( const ReadableWideForwardRange& r,
>                      const Codecvt& cc ); ...

First, as much as I love Boost.Range and would like to see it standardized, I
don't think the code conversion proposal should use it.

The second problem is that it's awkward to specify the codecvt instance when you
just want a codecvt to be grabbed from the global locale:

   std::locale loc;   // default-constructed: a copy of the current global locale
   const std::codecvt<wchar_t, char, std::mbstate_t>& cvt =
      std::use_facet< std::codecvt<wchar_t, char, std::mbstate_t> >(loc);
   // etc.
   std::string s = narrow(ws, cvt);

Most people will want to be able to write

   std::string s = narrow(ws);

Perhaps a good solution would be to have overloads, in addition to the ones I
showed, with a signature including a codecvt instance, as in your example.
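
As a rough sketch of that pair of overloads (restricted to std::wstring and
std::string for brevity; the typedef default_codecvt and the buffer size are
assumptions of mine, and a real version would be templated on the string and
codecvt types):

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>      // std::mbstate_t
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical typedef used only in this sketch.
typedef std::codecvt<wchar_t, char, std::mbstate_t> default_codecvt;

// Overload taking an explicit codecvt instance.
inline std::string narrow(const std::wstring& ws, const default_codecvt& cvt)
{
    const std::size_t buf_size = 256;
    char buf[buf_size];
    std::mbstate_t state = std::mbstate_t();
    std::string result;

    const wchar_t* from = ws.data();
    const wchar_t* const from_end = from + ws.size();

    while (from != from_end) {
        const wchar_t* from_next = from;
        char* to_next = buf;
        const std::codecvt_base::result r =
            cvt.out(state, from, from_end, from_next,
                    buf, buf + buf_size, to_next);
        assert(buf <= to_next && to_next <= buf + buf_size);

        // Fail on an explicit error or on a call that made no progress.
        const bool no_progress = (from_next == from && to_next == buf);
        if (r == std::codecvt_base::error || no_progress)
            throw std::runtime_error("narrow: conversion failed");

        result.append(buf, to_next);
        from = from_next;
    }
    // Note: unshift() is skipped here; a state-dependent encoding
    // would need it to emit a final shift sequence.
    return result;
}

// Convenience overload: grabs the codecvt from the global locale.
inline std::string narrow(const std::wstring& ws)
{
    std::locale loc;   // copy of the current global locale
    return narrow(ws, std::use_facet<default_codecvt>(loc));
}
```

The one-argument form covers the common case, while the two-argument form
lets a caller supply a facet from any locale.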

Jonathan

