From: Stefan Seefeld (seefeld_at_[hidden])
Date: 2007-02-27 10:04:35
Péter Szilágyi wrote:
>> No. Regex, when built with Unicode support, requires ICU for that. Boost
>> doesn't have its own Unicode stuff. (There's something in the vault,
> I ask because I was thinking about writing an XML lib for the Boost
> collection, but that requires specific Unicode string handling, which I
> don't have time to implement. So if there would be some (at least basic) UTF
> support (and of course the need for such), I would consider trying to
> implement the core XML parser (+some core extensions) as a Google Summer Of
> Code project.
Are you aware of the work that went into Boost XML 'bindings' in the past?
I submitted an XML library supporting a DOM API, as well as an xmlreader.
(The implementation was based on libxml2, as I believe it would be foolish
to attempt to reinvent that particular wheel.)
My strategy for dealing with Unicode has been to delegate that to the user,
i.e. all classes are parametrized for a (Unicode) string type. Users then
plug in their own Unicode library.
I believe this to be the only viable option: users often want
only Unicode, or only XML (e.g. if it is clear that the content is
all ASCII), so there is no reason to lump the two together.
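
To make the parametrization idea concrete, here is a minimal sketch. It is
not the API of the library I submitted: the names element, ascii_element and
unicode_element are hypothetical, and std::u32string merely stands in for
whatever string type a user's Unicode library provides.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch: an XML element parametrized on the string type.
// The XML code only stores, moves and compares strings; everything
// Unicode-specific is left to the String type the user supplies.
template <typename String>
class element
{
public:
  explicit element(String name) : name_(std::move(name)) {}

  String const &name() const { return name_; }

  // Precondition: no attribute with this key exists yet
  // (XML does not allow duplicate attribute names on one element).
  void set_attribute(String key, String value)
  {
    assert(attribute(key) == nullptr);
    attributes_.emplace_back(std::move(key), std::move(value));
  }

  // Returns a pointer to the attribute value, or nullptr if it is absent.
  String const *attribute(String const &key) const
  {
    for (auto const &a : attributes_)
      if (a.first == key) return &a.second;
    return nullptr;
  }

private:
  String name_;
  std::vector<std::pair<String, String> > attributes_;
};

// A user whose content is known to be all ASCII plugs in std::string...
typedef element<std::string> ascii_element;
// ...while a user who needs Unicode plugs in a type from their own library
// (std::u32string is only a stand-in here).
typedef element<std::u32string> unicode_element;
```

The XML side never has to know how the string type encodes or normalizes
its content, which is why the two concerns can live in separate libraries.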
It would be great to get some momentum to review all past and present ideas
and build something on top of that. I may be able to help, if you are interested.
-- ...ich hab' noch einen Koffer in Berlin...