From: Stefan Seefeld (stefan_at_[hidden])
Date: 2021-03-31 20:10:42

Hi Vinicius,

allow me to jump into this discussion with some thoughts.

On 2021-03-10 2:16 p.m., Vinícius dos Santos Oliveira via Boost wrote:
> XML is an old, overengineered and hated format (and rightfully so),
> but industry adoption basically forces us to use it for
> interoperability with a few services to this day. So that's the value
> for XML here, interoperability with legacy software. It's not a value
> to be neglected.
> I also think it'd be a good project for first-time students as the
> basics of the format are really well-known and I believe in my skills
> to gradually point the student to its quirks as the project advances.

I'll give much the same advice I shared regarding the FFT proposals:
please consider not re-implementing a full XML library (which is quite
a daunting task), but rather focus on the C++ API as an *interface* that
can be layered on top of existing XML libraries.

The world already has way too many incomplete and buggy XML libraries.
Please let's not make it worse.

The approach I took (admittedly many years ago) consisted of
defining a C++ API around one of the more popular (and efficient)
implementations at the time, libxml2, with
support for a DOM-like API as well as a SAX-like streaming API.
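
A minimal sketch of what such layering could look like, assuming a
DOM-like subset only. All names here are hypothetical and not taken from
any existing library; toy_backend is an in-memory stand-in for an adapter
that would forward to libxml2 (or another library) instead:

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Stand-in backend (hypothetical). A real adapter would expose the same
// members but forward to an existing XML library's handles.
struct toy_backend {
    struct node {
        std::string name;
        std::vector<std::unique_ptr<node>> children;
    };
    using document = std::unique_ptr<node>;  // owns the whole tree
    using node_ref = node*;                  // non-owning handle

    static document create(const std::string& root_name) {
        auto root = std::make_unique<node>();
        root->name = root_name;
        return root;
    }
    static node_ref root(const document& d) { return d.get(); }
    static node_ref append_child(node_ref parent, const std::string& name) {
        parent->children.push_back(std::make_unique<node>());
        parent->children.back()->name = name;
        return parent->children.back().get();
    }
    static const std::string& name(node_ref n) { return n->name; }
};

// Public API: user code only sees these classes, never backend types.
template <typename Backend>
class basic_node {
public:
    explicit basic_node(typename Backend::node_ref n) : node_(n) {
        assert(n != nullptr);
    }
    std::string name() const { return Backend::name(node_); }
    basic_node append(const std::string& child_name) {
        return basic_node(Backend::append_child(node_, child_name));
    }
private:
    typename Backend::node_ref node_;
};

template <typename Backend>
class basic_document {
public:
    explicit basic_document(const std::string& root_name)
        : doc_(Backend::create(root_name)) {}
    basic_node<Backend> root() {
        return basic_node<Backend>(Backend::root(doc_));
    }
private:
    typename Backend::document doc_;
};

// Swapping the backend means changing only this alias.
using document = basic_document<toy_backend>;
```

With this shape, user code such as `document doc("config");
doc.root().append("item");` does not name the backend at all, which is
the property a second backend would put to the test.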

Of particular importance is that a fully functional XML API needs to
have some support for Unicode, which is sadly still quite difficult to
do in C++. My choice was to parametrize the entire API around the
character type, letting users pick their own Unicode bindings (a simple
trait-like class would be enough to bind to alternative types there).
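
One way the character-type parametrization might be sketched, assuming
the backend stores UTF-8 internally. The names converter and
basic_element are hypothetical, and the decoder assumes well-formed
input rather than validating it:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical trait: converts between the user's string type S and the
// UTF-8 bytes the backend is assumed to work with.
template <typename S>
struct converter;  // primary template left undefined on purpose

// Binding for std::string, assumed to hold UTF-8 already.
template <>
struct converter<std::string> {
    static std::string to_utf8(const std::string& s) { return s; }
    static std::string from_utf8(const std::string& s) { return s; }
};

// Binding for std::u32string (one code point per element).
template <>
struct converter<std::u32string> {
    static std::string to_utf8(const std::u32string& s) {
        std::string out;
        for (char32_t c : s) {
            assert(c <= 0x10FFFF);
            if (c < 0x80) {
                out += static_cast<char>(c);
            } else if (c < 0x800) {
                out += static_cast<char>(0xC0 | (c >> 6));
                out += static_cast<char>(0x80 | (c & 0x3F));
            } else if (c < 0x10000) {
                out += static_cast<char>(0xE0 | (c >> 12));
                out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
                out += static_cast<char>(0x80 | (c & 0x3F));
            } else {
                out += static_cast<char>(0xF0 | (c >> 18));
                out += static_cast<char>(0x80 | ((c >> 12) & 0x3F));
                out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
                out += static_cast<char>(0x80 | (c & 0x3F));
            }
        }
        return out;
    }
    // Assumes well-formed UTF-8; no validation is attempted.
    static std::u32string from_utf8(const std::string& s) {
        std::u32string out;
        std::size_t i = 0;
        while (i < s.size()) {
            const unsigned char lead = static_cast<unsigned char>(s[i]);
            const std::size_t extra =
                lead < 0x80 ? 0 : lead < 0xE0 ? 1 : lead < 0xF0 ? 2 : 3;
            char32_t c = extra == 0 ? lead : (lead & (0x3F >> extra));
            for (std::size_t k = 1; k <= extra && i + k < s.size(); ++k)
                c = (c << 6) | (static_cast<unsigned char>(s[i + k]) & 0x3F);
            out += c;
            i += extra + 1;
        }
        return out;
    }
};

// API class parametrized on the user's string type.
template <typename S>
class basic_element {
public:
    explicit basic_element(const S& name)
        : name_(converter<S>::to_utf8(name)) {}
    S name() const { return converter<S>::from_utf8(name_); }
    const std::string& utf8_name() const { return name_; }
private:
    std::string name_;  // kept as UTF-8, as the backend would see it
};
```

Users with their own Unicode string type would add one more converter
specialization; nothing else in the API would need to change.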

Anyhow, my code is still online, if anyone wants to have a look:

While the libxml2 bindings work very nicely (including xpath support and
some other nice features), I never felt comfortable proposing my work in
its current form for adoption into Boost without having added at least
one other XML library backend (Xerces comes to mind), to make sure the
API itself is robust enough and doesn't accidentally leak libxml2 design



       ...ich hab' noch einen Koffer in Berlin...
