Subject: Re: [boost] [strings][unicode] Proposals for Improved String Interoperability in a Unicode World
From: Beman Dawes (bdawes_at_[hidden])
Date: 2012-01-30 09:50:42


On Sat, Jan 28, 2012 at 8:12 PM, Mathias Gaunard
<mathias.gaunard_at_[hidden]> wrote:
> On 01/28/2012 05:46 PM, Beman Dawes wrote:
>>
>> Beman.github.com/string-interoperability/interop_white_paper.html
>> describes Boost components intended to ease string interoperability in
>> general and Unicode string interoperability in particular.
>>
>> These proposals are the Boost version of the TR2 proposals made in
>> N3336, Adapting Standard Library Strings and I/O to a Unicode World.
>> See http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3336.html.
>>
>> I'm very interested in hearing comments about either the Boost or the
>> TR2 proposal. Are these useful additions? Is there a better way to
>> achieve the same easy interoperability goals?
>
>
> I think you should consider the points being made in N3334.

Ah, thanks! Yes, that's a very interesting proposal. I've started a
separate thread to discuss it, so I won't repeat that discussion here.

>> Where is the best home for the Boost proposals? A separate library?
>> Part of some existing library?
>>
>> Are these proposals orthogonal to the need for deeper Unicode
>> functionality, such as Mathias Gaunard's Unicode components?
>
>
> It seems all you really care about is having iterator adaptors that do
> character set conversion, allowing any range in any encoding to be
> lazily converted to a particular Unicode encoding.

Yes, that's a fair summary.
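
To make "lazily convert" concrete, here is a minimal sketch of an input
iterator adaptor that decodes UTF-8 to UTF-32 one code point at a time.
The names utf8_to_utf32_iterator, done() and to_utf32 are hypothetical
and are not taken from the white paper, N3336, or Mathias's library. The
sketch assumes well-formed UTF-8 and does no error handling.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>

// Hypothetical adaptor: wraps iterators over UTF-8 code units and decodes
// one UTF-32 code point only when the caller advances.
// Assumption: the input is well-formed UTF-8.
template <class InputIt>
class utf8_to_utf32_iterator {
public:
    typedef std::input_iterator_tag iterator_category;
    typedef char32_t value_type;
    typedef std::ptrdiff_t difference_type;
    typedef const char32_t* pointer;
    typedef char32_t reference;

    utf8_to_utf32_iterator(InputIt first, InputIt last)
        : cur_(first), end_(last), value_(0), done_(false) { decode(); }

    char32_t operator*() const { assert(!done_); return value_; }
    utf8_to_utf32_iterator& operator++() { decode(); return *this; }

    // Simplification: an explicit end test instead of iterator comparison.
    bool done() const { return done_; }

private:
    void decode() {
        if (cur_ == end_) { done_ = true; return; }
        const unsigned char lead = static_cast<unsigned char>(*cur_++);
        // Number of continuation bytes implied by the lead byte.
        const int extra = lead < 0x80 ? 0 : lead < 0xE0 ? 1 : lead < 0xF0 ? 2 : 3;
        // Keep only the payload bits of the lead byte.
        char32_t cp = extra == 0 ? lead : (lead & (0x3F >> extra));
        for (int i = 0; i < extra && cur_ != end_; ++i)
            cp = (cp << 6) | (static_cast<unsigned char>(*cur_++) & 0x3F);
        value_ = cp;
    }

    InputIt cur_;
    InputIt end_;
    char32_t value_;
    bool done_;
};

// Convenience wrapper that drains the adaptor into a UTF-32 string.
inline std::u32string to_utf32(const std::string& s) {
    std::u32string out;
    utf8_to_utf32_iterator<std::string::const_iterator> it(s.begin(), s.end());
    for (; !it.done(); ++it)
        out.push_back(*it);
    return out;
}
```

Nothing is converted until the caller dereferences or advances the
adaptor, so a consumer that stops early never pays for the rest of the
range.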

> This has always been the goal of my library, which somewhat provides that
> along with more advanced Unicode features. Those two things could live
> separately though.

I'm still feeling my way. I'd actually prefer to leave the encoding
conversion to someone else. It's like my POD relaxation proposal that
went into C++11 - I really didn't feel qualified to do that work, but
none of the experts stepped forward. So I got sucked into the problem.

> For standardization, the problem with iterator adaptors is that they cannot
> be as fast as free functions operating on pointers, unless the optimizer is
> pretty darn good.

Yes, but the optimizers are often "pretty darn good", and iterator
adapters are very flexible.

> The conversion algorithms are also fully templated and
> cannot be put in the library binary.

That may well be correct for the general algorithms, but I'd be
surprised if specializations for the most common cases couldn't call
down to compiled binary functions.
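
One way this could look, as a sketch only: a generic template stays in
the header, while an overload for plain pointers forwards to a
non-template function that could be compiled into the library binary.
The names latin1_to_utf32 and latin1_to_utf32_compiled are hypothetical,
and ISO-8859-1 input is used purely because its conversion is trivial
(each byte value equals its code point).

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>

// Generic algorithm: works with any iterator types, so it must live in a header.
template <class InputIt, class OutputIt>
OutputIt latin1_to_utf32(InputIt first, InputIt last, OutputIt out) {
    for (; first != last; ++first)
        *out++ = static_cast<char32_t>(static_cast<unsigned char>(*first));
    return out;
}

// Non-template function: only this declaration would need to be in the header.
std::size_t latin1_to_utf32_compiled(const char* first, const char* last,
                                     char32_t* out);

// Overload for the common pointer case. Overload resolution prefers this
// non-template function over the template when the arguments match exactly.
inline char32_t* latin1_to_utf32(const char* first, const char* last,
                                 char32_t* out) {
    return out + latin1_to_utf32_compiled(first, last, out);
}

// In a real library this definition would sit in a .cpp file and be built
// into the library binary; it is shown here only to keep the sketch complete.
std::size_t latin1_to_utf32_compiled(const char* first, const char* last,
                                     char32_t* out) {
    assert(out != nullptr || first == last);
    std::size_t n = 0;
    for (; first != last; ++first)
        out[n++] = static_cast<unsigned char>(*first);
    return n;
}
```

Callers use the same name either way; only the argument types decide
whether the header-only or the compiled path is taken.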

> Those are disadvantages compared to the mechanisms that exist today in the
> standard.
>
> By the way you only have input iterator adaptors. In my library I've
> implemented bidirectional iterator adaptors and output iterator adaptors.
> You've only been considering input, but output can also be useful depending
> on the situation.

There is a to-do list item to implement bidirectional iterator
adapters, and output iterator adapters are worth some work too.
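
For the output direction, a minimal sketch of what such an adaptor might
look like: it accepts UTF-32 code points and writes the UTF-8 encoding
to an underlying output iterator. The names utf8_output_iterator and
to_utf8 are hypothetical, and the sketch assumes valid code points (no
surrogate or range checking beyond an assert).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>

// Hypothetical output iterator adaptor: each assigned UTF-32 code point is
// encoded as one to four UTF-8 code units written to the wrapped iterator.
template <class OutputIt>
class utf8_output_iterator {
public:
    typedef std::output_iterator_tag iterator_category;
    typedef void value_type;
    typedef std::ptrdiff_t difference_type;
    typedef void pointer;
    typedef void reference;

    explicit utf8_output_iterator(OutputIt out) : out_(out) {}

    utf8_output_iterator& operator=(char32_t cp) {
        assert(cp <= 0x10FFFF);
        if (cp < 0x80) {
            *out_++ = static_cast<char>(cp);
        } else if (cp < 0x800) {
            *out_++ = static_cast<char>(0xC0 | (cp >> 6));
            *out_++ = static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            *out_++ = static_cast<char>(0xE0 | (cp >> 12));
            *out_++ = static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            *out_++ = static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            *out_++ = static_cast<char>(0xF0 | (cp >> 18));
            *out_++ = static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            *out_++ = static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            *out_++ = static_cast<char>(0x80 | (cp & 0x3F));
        }
        return *this;
    }

    // As with std::back_insert_iterator, dereference and increment are no-ops;
    // all the work happens in the assignment above.
    utf8_output_iterator& operator*() { return *this; }
    utf8_output_iterator& operator++() { return *this; }
    utf8_output_iterator& operator++(int) { return *this; }

private:
    OutputIt out_;
};

// Convenience wrapper: encode a UTF-32 string as UTF-8 via std::copy.
inline std::string to_utf8(const std::u32string& s) {
    std::string out;
    typedef std::back_insert_iterator<std::string> sink;
    std::copy(s.begin(), s.end(),
              utf8_output_iterator<sink>(std::back_inserter(out)));
    return out;
}
```

Because the adaptor is itself an output iterator, it can be handed to
any standard algorithm that writes through one, which is where the
output form tends to be more convenient than the input form.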

Thanks for your comments,

--Beman

