Subject: [Boost-users] [Spirit] Looking for a little Qi guidance for Unicode parsing
From: Michael Powell (mwpowellhtx_at_[hidden])
Date: 2019-01-27 18:05:14


Hello,

I am turning a corner in my JSON parser. It handles ASCII through and
through, but now I want to support Unicode, which is part of the JSON
standard (apparently as UTF-8). From what I can tell, this does not
affect the entire grammar, only Strings.

I am looking for a little guidance on how to approach this and which
elements are involved. For example, are we talking about C++
std::wstring? I have also seen std::u32string referenced in some
forums.

To begin with, and this may be a somewhat naive impression: would the
characters then translate not to unsigned char or char, but rather to
std::wstring::value_type or std::u32string::value_type? Questions like
that come to mind when approaching the issue.
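
To make the storage question concrete, here is a minimal standard-library
sketch, independent of Spirit. It assumes the input is UTF-8 held in a plain
std::string (UTF-8 code units fit in char) and decodes it into a
std::u32string, where each char32_t element is one code point. The helper name
utf8_to_utf32 is hypothetical, and the validation is deliberately loose:
malformed bytes become U+FFFD, and overlong encodings are not rejected.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: decode UTF-8 bytes (stored in char) into UTF-32
// code points (stored in char32_t). Not a full validator.
std::u32string utf8_to_utf32(const std::string& in) {
    const char32_t replacement = 0xFFFD;  // U+FFFD REPLACEMENT CHARACTER
    std::u32string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i++]);
        char32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else { out.push_back(replacement); continue; }  // stray byte

        bool ok = true;
        for (std::size_t k = 0; k < extra; ++k) {
            // Each continuation byte must look like 10xxxxxx.
            if (i >= in.size() ||
                (static_cast<unsigned char>(in[i]) & 0xC0) != 0x80) {
                ok = false;
                break;
            }
            cp = (cp << 6) | (static_cast<unsigned char>(in[i]) & 0x3F);
            ++i;
        }
        assert(i <= in.size());  // the cursor never runs past the input
        out.push_back(ok ? cp : replacement);
    }
    return out;
}
```

For example, utf8_to_utf32("\xE2\x82\xAC") yields a single element, U+20AC,
from three input bytes. One point in favour of char32_t over wchar_t for this
purpose is that wchar_t is 16 bits wide on Windows and 32 bits on most Unix
systems, so std::wstring does not hold one code point per element everywhere.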

Additionally, how should I otherwise handle symbol tables, such as the
one for escape characters? For example, starting from:

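#include <boost/spirit/include/qi.hpp>

namespace qi = boost::spirit::qi;

// Maps each two-character escape sequence to the character it denotes.
struct escapes_t : qi::symbols<char, char> {
    escapes_t() {
        this->add("\\b", '\b')
            ("\\f", '\f')
            ("\\n", '\n')
            ("\\r", '\r')
            ("\\t", '\t')
            ("\\v", '\v')    // not a JSON escape per RFC 8259
            ("\\\\", '\\')
            ("\\/", '/')
            ("\\'", '\'')    // not a JSON escape per RFC 8259
            ("\\\"", '"')
            ;
    }
} char_esc;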

And on from there.
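
The single-character table above cannot express the four-hex-digit Unicode
escape, since one such escape may expand to several UTF-8 bytes. As a rough
illustration of what the string rule would have to do, here is a
standard-library sketch that does not use Qi at all. The function names
(append_utf8, read_hex4, unescape_json) are hypothetical, and it assumes the
output should be UTF-8 in a std::string. In a Qi grammar the hex digits could
presumably be parsed with something like qi::uint_parser<unsigned, 16, 4, 4>,
but that wiring is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Append one code point to a UTF-8 encoded string.
void append_utf8(std::string& out, char32_t cp) {
    assert(cp <= 0x10FFFF);
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Read exactly four hex digits starting at s[pos]; advances pos past them.
char32_t read_hex4(const std::string& s, std::size_t& pos) {
    if (pos + 4 > s.size()) throw std::runtime_error("truncated hex escape");
    char32_t v = 0;
    for (int k = 0; k < 4; ++k) {
        const char c = s[pos++];
        v <<= 4;
        if (c >= '0' && c <= '9')      v |= static_cast<char32_t>(c - '0');
        else if (c >= 'a' && c <= 'f') v |= static_cast<char32_t>(c - 'a' + 10);
        else if (c >= 'A' && c <= 'F') v |= static_cast<char32_t>(c - 'A' + 10);
        else throw std::runtime_error("bad hex digit in escape");
    }
    return v;
}

// Decode the body of a JSON string (without the surrounding quotes).
std::string unescape_json(const std::string& s) {
    std::string out;
    std::size_t i = 0;
    while (i < s.size()) {
        const char c = s[i++];
        if (c != '\\') { out += c; continue; }  // raw UTF-8 bytes pass through
        if (i >= s.size()) throw std::runtime_error("dangling backslash");
        switch (s[i++]) {
            case 'b':  out += '\b'; break;
            case 'f':  out += '\f'; break;
            case 'n':  out += '\n'; break;
            case 'r':  out += '\r'; break;
            case 't':  out += '\t'; break;
            case '"':  out += '"';  break;
            case '\\': out += '\\'; break;
            case '/':  out += '/';  break;
            case 'u': {
                char32_t cp = read_hex4(s, i);
                if (cp >= 0xD800 && cp <= 0xDBFF) {
                    // High surrogate: a second u-escape with the low half must follow.
                    if (i + 2 > s.size() || s[i] != '\\' || s[i + 1] != 'u')
                        throw std::runtime_error("missing low surrogate");
                    i += 2;
                    const char32_t lo = read_hex4(s, i);
                    if (lo < 0xDC00 || lo > 0xDFFF)
                        throw std::runtime_error("invalid low surrogate");
                    cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
                } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
                    throw std::runtime_error("lone low surrogate");
                }
                append_utf8(out, cp);
                break;
            }
            default: throw std::runtime_error("unknown escape");
        }
    }
    return out;
}
```

With this sketch, an escaped surrogate pair such as D83D followed by DE00
decodes to the four UTF-8 bytes of U+1F600, while the simple escapes behave
exactly like the entries in the symbol table.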

Thanks!

Best regards,

Michael W Powell


Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net