From: Phil Endecott (spam_from_boost_dev_at_[hidden])
Date: 2008-02-19 07:40:28

Next message: shunsuke: "Re: [boost] [Iterator] possible zip_iterator problem (was:[Fwd: Re: Pairing ptr_containers with zip iterators])"
Previous message: Paul A Bristow: "Re: [boost] Ann: Floating Point Utilities Review starts today"
In reply to: Phil Endecott: "Re: [boost] UTF-8 conversion setc. [was: [String algorithm] is_any_of has inefficient implementation]"
Next in thread: Frank Mori Hess: "Re: [boost] UTF-8 conversion etc."
Reply: Frank Mori Hess: "Re: [boost] UTF-8 conversion etc."
Reply: Sebastian Redl: "Re: [boost] UTF-8 conversion etc."
Reply: Sebastian Redl: "Re: [boost] UTF-8 conversion etc."
Maybe reply: Graham: "Re: [boost] UTF-8 conversion etc."
Maybe reply: Graham: "Re: [boost] UTF-8 conversion etc."
Reply: Sebastian Redl: "Re: [boost] UTF-8 conversion etc."

Phil Endecott wrote:
> Felipe Magno de Almeida wrote:
>> On Fri, Feb 15, 2008 at 3:54 PM, Phil Endecott wrote:
>>> This week I
>>> have been writing some UTF-8 encoding and decoding and
>>> Unicode<->iso8859 conversion algorithms. They seem to be faster than
>>> the libc implementations which is satisfying especially as I haven't
>>> even started on the serious optimisations yet. This will be part of
>>> the strings-tagged-with-character-sets stuff that I have described
>>> before. Anyone interested?
>>
>> Sure. Though I'm most interested in all charset conversions. But the
>> most usual is enough to speed up my application *a lot*.
>
> Thanks to everyone who expressed an interest.
>
> I will attempt to have some sort of documentation and code available in
> the next few days. Pester me if I don't produce anything.

OK, the code is here:
http://svn.chezphil.org/libpbe/trunk/include/charset/

and there are some very basic docs here:
http://svn.chezphil.org/libpbe/trunk/doc/charsets/
(Have a look at intro.txt for the feature list.)

This code is not yet Boostified (namespaces, directory layout etc.).
Most of it compiles but it has hardly been exercised at all.
The functionality includes conversion between UTF-8, UCS-2, UCS-4,
ASCII and ISO-8859-*.
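
To give a feel for the simplest of these conversions, here is a minimal sketch of ISO-8859-1 to UTF-8. It is not the code from the repository above; the function name `latin1_to_utf8` is a hypothetical one chosen for illustration. It relies only on the fact that ISO-8859-1 bytes map directly onto code points U+0000 to U+00FF, so each input byte becomes either one or two UTF-8 units.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper for illustration; not part of the posted library.
// Converts ISO-8859-1 (Latin-1) bytes to UTF-8.
std::string latin1_to_utf8(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (unsigned char c : in) {
        if (c < 0x80) {
            // ASCII range: one unit, unchanged.
            out.push_back(static_cast<char>(c));
        } else {
            // U+0080..U+00FF: two units, 110xxxxx 10xxxxxx.
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    // Sanity check: the output is never shorter than the input.
    assert(out.size() >= in.size());
    return out;
}
```

Other ISO-8859-* parts would need a lookup table for the upper half, since only ISO-8859-1 coincides with the first 256 Unicode code points.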

Things I'd appreciate feedback on:
- What should the cs_string look like? Basically everywhere that
std::string uses an integer position I have the choice of a character
position, a unit position, or an iterator - or not providing that function.
- What character sets are people interested in using (a) at the "edges"
of their programs, and (b) in the "core"?
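
The difference between a character position and a unit position in the first question can be sketched as follows for UTF-8 held in a plain std::string. This is an illustration only and assumes valid UTF-8 input; the names `utf8_char_count` and `utf8_unit_pos` are hypothetical and do not come from cs_string.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helpers for illustration; assume valid UTF-8 input.

// Number of characters (code points): every byte that is not a
// continuation byte (10xxxxxx) starts a new character.
std::size_t utf8_char_count(const std::string& s) {
    std::size_t n = 0;
    for (unsigned char c : s)
        if ((c & 0xC0) != 0x80) ++n;
    return n;
}

// Unit (byte) offset at which character number char_pos starts.
// Returns s.size() when char_pos equals the character count.
// This is a linear scan, unlike indexing by unit position.
std::size_t utf8_unit_pos(const std::string& s, std::size_t char_pos) {
    assert(char_pos <= utf8_char_count(s));
    std::size_t chars_seen = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if ((static_cast<unsigned char>(s[i]) & 0xC0) != 0x80) {
            if (chars_seen == char_pos) return i;
            ++chars_seen;
        }
    }
    return s.size();
}
```

For the string "h\xC3\xA9llo" there are 5 characters but 6 units, and character position 2 corresponds to unit position 3. The linear cost of that mapping is one reason an iterator-based interface may be attractive for variable-width encodings.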

Regards, Phil.

