Subject: Re: [boost] C++ Networking Library Release 0.5
From: Hartmut Kaiser (hartmut.kaiser_at_[hidden])
Date: 2010-02-02 08:13:27

> > I'm not sure what you mean by "the straightforward way implies creating
> > a string". String for what? The attribute? The input? Spirit does not
> > allocate any memory. Also, AFAIK, you can avoid using strings. Perhaps be
> > more specific?
> Let's say I have an input of const char *. My input consists of a command
> with up to two parameters. I used Spirit to parse this command and
> decompose it into chunks (which means that the result of the parsing will
> be one enum and two "strings").
> Currently it's a simple loop which returns the pointers to the chunks.
> Very fast (obviously).
> From what I understood of Spirit, I wrote a parser which created a string
> when it found my chunk. Off the top of my head it might have looked like
> this:
>     lit("command1") >> +char_[ref(my_str) = _1] >> lit("separator") >>
>         +char_[ref(my_str2) = _1] |
>     lit("command2") >> +alnum_[ref(my_str) = _1]

    typedef iterator_range<char const*> range_type;
    typedef std::pair<range_type, range_type> result_type;
    rule<char const*, result_type()> r =
        "command1" >> raw[+(!lit("separator") >> char_)] >> "separator" >>
        "command2" >> raw[+alnum];

This does the same as the above, except that it returns two pairs of
pointers to the arguments of your commands. Just call it as:

    char const* begin = ...;
    char const* end = ...;
    result_type rt;
    parse(begin, end, r, rt);

allowing you to access your pairs of pointers from rt.
Voila! No memory allocation at all!

> This is really cool and much easier to understand than the current loop.
> Currently the memory allocation occurs when putting the input into the
> string. I now realize I can replace ref(my_str) = _1 with something that's
> going to build a pair of pointers based on the input, which should reduce
> the gap between Spirit and the custom parser.
> But then I encountered a different problem, which is that the 'grammar'
> cannot be read from left to right. Basically the input may contain the
> separator, so what I currently do is read my command, start from the
> right, when I reach the separator I have my second token and the rest is
> the first token. That can be avoided as well by offloading this part from
> Spirit.

I think this is solved by the above rule.
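
For comparison, the hand-written right-to-left scan described in the quoted
text might look roughly like the sketch below. It is plain standard C++ and
does not use Spirit. The names range, find_last and split_from_right are made
up for illustration, and it assumes that the separator is a fixed string and
that the command keyword has already been stripped from the input. Like the
rule above, it only hands back pointers into the input and allocates nothing.

```cpp
#include <cassert>
#include <cstring>
#include <utility>

// A half-open [first, second) pair of pointers into the input; nothing is copied.
// (Hypothetical stand-in for iterator_range<char const*>.)
typedef std::pair<char const*, char const*> range;

// Hypothetical helper: find the last occurrence of sep in [begin, end).
// Returns end if sep is empty or does not occur.
inline char const* find_last(char const* begin, char const* end, char const* sep)
{
    assert(begin <= end);
    std::size_t const n = std::strlen(sep);
    if (n == 0 || static_cast<std::size_t>(end - begin) < n)
        return end;
    // Walk backwards from the last position where sep could still fit.
    for (char const* p = end - n; ; --p) {
        if (std::memcmp(p, sep, n) == 0)
            return p;
        if (p == begin)
            break;
    }
    return end;
}

// Split [begin, end) at the rightmost separator: everything before it is the
// first token, everything after it is the second. Returns false (and leaves
// the outputs untouched) if no separator is found.
inline bool split_from_right(char const* begin, char const* end,
                             char const* sep, range& first, range& second)
{
    char const* p = find_last(begin, end, sep);
    if (p == end)
        return false;
    first = range(begin, p);
    second = range(p + std::strlen(sep), end);
    return true;
}
```

Called on the five characters "a;b;c" with ";" as the separator, this yields
"a;b" as the first token and "c" as the second, so a separator inside the
first token is tolerated.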

Regards Hartmut

Meet me at BoostCon
