From: Esteve Fernandez (esteve_at_[hidden])
Date: 2008-03-21 04:11:17


Hi Robin

On Friday 21 March 2008 01:35:15 Robin Redeker wrote:
> Actually, JSON is not really a subset of YAML; see also the documentation
> of Marc Lehmann's Perl module. He wrote a fully standards-compliant
> JSON parser (in C, bound to Perl):
>
> http://search.cpan.org/~mlehmann/JSON-XS-2.1/XS.pm
>
> Look at the comparison in the section 'YAML and JSON'.

Yes, you're right. Strictly speaking, JSON is not a YAML subset, but the most
widely used YAML parser (Syck) accepts JSON as well. Then again, the most
widely used HTML parser (Trident, in IE) accepts all sorts of things that are
not HTML :-) so it's best not to try to parse YAML for the time being and to
stick to a pure JSON parser.

> > - what about Unicode? I know that Boost.Regex supports Unicode if
> > compiled against ICU and the JSON spec states that everything must be in
> > Unicode (correct me if I'm wrong)
>
> Yes, the JSON spec states that JSON is Unicode text, encoded in (any)
> Unicode encoding (usually UTF-8). However, there is one hard part when
> writing a JSON parser: you have to take care to handle the \uXXXX
> literals in strings correctly. The JSON spec (RFC 4627,
> http://www.ietf.org/rfc/rfc4627.txt ) states in section 2.5:
>
> To escape an extended character that is not in the Basic
> Multilingual Plane, the character is represented as a
> twelve-character sequence, encoding the UTF-16 surrogate pair.
> So, for example, a string containing only the G clef character
> (U+1D11E) may be represented as "\uD834\uDD1E".
>
> So care must be taken not to overlook this small detail.

Thanks for pointing this out. This is one of the things that worries me about
the JSON parser before I apply to GSoC: whether it has to be fully compliant
with the Unicode part of the JSON spec. TinyJSON advertises itself as Unicode
compliant, but I don't know whether it takes this detail of the JSON spec
into account.
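
To make the surrogate-pair rule quoted above concrete, here is a minimal C++
sketch of how a parser might decode a \uXXXX escape into UTF-8. The names
decode_u_escape, parse_hex4 and append_utf8 are hypothetical and are not taken
from TinyJSON, JSON.Spirit or any other library mentioned here. Rejecting
unpaired surrogates is an assumption of this sketch; RFC 4627 does not say
what a parser should do with them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Parse exactly four hex digits starting at s[pos]. Returns false if the
// input is too short or contains a non-hex character.
static bool parse_hex4(const std::string& s, std::size_t pos, std::uint32_t& value) {
    if (pos + 4 > s.size()) return false;
    value = 0;
    for (std::size_t i = 0; i < 4; ++i) {
        const char c = s[pos + i];
        std::uint32_t digit;
        if (c >= '0' && c <= '9')      digit = static_cast<std::uint32_t>(c - '0');
        else if (c >= 'a' && c <= 'f') digit = static_cast<std::uint32_t>(c - 'a' + 10);
        else if (c >= 'A' && c <= 'F') digit = static_cast<std::uint32_t>(c - 'A' + 10);
        else return false;
        value = (value << 4) | digit;
    }
    return true;
}

// Append the code point cp to out, encoded as UTF-8 (1 to 4 bytes).
static void append_utf8(std::uint32_t cp, std::string& out) {
    assert(cp <= 0x10FFFF);
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Decode one escape starting at s[pos], which must be a backslash followed
// by 'u'. On success, appends the UTF-8 bytes to out and advances pos past
// the escape (6 characters, or 12 for a surrogate pair). On failure, leaves
// pos and out untouched.
bool decode_u_escape(const std::string& s, std::size_t& pos, std::string& out) {
    if (pos + 6 > s.size() || s[pos] != '\\' || s[pos + 1] != 'u') return false;
    std::uint32_t hi = 0;
    if (!parse_hex4(s, pos + 2, hi)) return false;

    // Assumption: a low surrogate without a preceding high one is an error.
    if (hi >= 0xDC00 && hi <= 0xDFFF) return false;

    // Ordinary BMP character: a single six-character escape.
    if (hi < 0xD800 || hi > 0xDBFF) {
        append_utf8(hi, out);
        pos += 6;
        return true;
    }

    // High surrogate: a second escape holding the low surrogate must follow,
    // giving the twelve-character sequence described in RFC 4627.
    if (pos + 12 > s.size() || s[pos + 6] != '\\' || s[pos + 7] != 'u') return false;
    std::uint32_t lo = 0;
    if (!parse_hex4(s, pos + 8, lo) || lo < 0xDC00 || lo > 0xDFFF) return false;

    // Combine the two 10-bit halves into one code point above U+FFFF.
    const std::uint32_t cp = 0x10000 + ((hi - 0xD800) << 10) + (lo - 0xDC00);
    append_utf8(cp, out);
    pos += 12;
    return true;
}
```

With the RFC's example, "\uD834\uDD1E" combines to U+1D11E, which this sketch
emits as the four UTF-8 bytes F0 9D 84 9E. A parser that treated each escape
independently would emit two invalid three-byte sequences instead.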

> > - TinyJSON and JSON.Spirit both use a MIT-like license (JSON.Spirit is
> > licensed under CPOL). The Boost license is compatible with them but,
> > could it pose a problem? There's JSONcpp [1] as well, which is public
> > domain.
>
> I wrote a JSON parser in C++, which I haven't released yet, that should
> be almost 100% compliant with the JSON RFC. However, I defined my own
> bytebuffer class with special UTF-8 handling, as I only have to deal
> with UTF-8 in my problem domain.
>
> But I could release the code under the Boost license anytime if someone
> is interested. I guess some work would still have to be done before it
> could be included in Boost, though.

Great! I'll surely ask you for your code :-)

Cheers.


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk