Boost Users :
From: Pablo Aguilar (pablo.aguilar_at_[hidden])
Date: 2005-06-02 16:24:10
This sounds like a job for something like Spirit
(http://www.boost.org/libs/spirit/), rather than tokenizer...
When trying to implement this for tokenizer, you'll likely be duplicating
stuff already done for you by Spirit.
Pablo
"Dennis Jones" <djones_at_[hidden]> wrote in message
news:d7nr4q$mt3$1_at_sea.gmane.org...
> Hi,
>
> I'm using the tokenizer class to allow users of my program to concatenate
> fields of data into a resultant string, where each field can be a quoted
> string literal, or some pre-defined entity that gets substituted by the
> program at some point later. The + symbol is treated like a concatenation
> operator. For example, a user might enter a string like this (including the
> quotes):
>
> "hello," + " world"
>
> In this case, my program would concatenate the two string literals
> ("hello," and " world") together so that the result is "hello, world" (note
> that these quotes are not actually part of the result string). My basic
> tokenizer usage is below:
>
> // FieldSpec is the incoming string as entered by the
> // user, including quotes to denote string literals
> std::string str = FieldSpec.c_str();
>
> typedef boost::tokenizer<boost::char_separator<char> > tokenizer;
> boost::char_separator<char> fieldSeparator("+", "", boost::keep_empty_tokens);
> tokenizer fieldTokens(str, fieldSeparator);
> for ( tokenizer::iterator tok_iter = fieldTokens.begin();
>       tok_iter != fieldTokens.end();
>       ++tok_iter )
> {
>     // do something with the token
>     // (could be a string literal or a pre-defined entity)
> }
>
> The problem I have is that the user might wish to include plus signs
> in his string literals, as in this example:
>
> "1" + " + " + "2 = 3"
>
> Here, the user has entered a " + " which should indicate a literal plus sign
> as opposed to a concatenation operator. The obvious desired result would be:
>
> "1 + 2 = 3" (minus the quotes)
>
> My current usage of tokenizer does not handle this at all, as it has no
> regard for _where_ the '+' symbols are located in the user's string; that
> is, it doesn't care if they are within quotes or not.
>
> I would like my tokenizer usage to be smart enough to know the difference
> between _real_ token separators and those that might exist as string
> literals within quotes. Can I use the tokenizer class to do this, or do I
> need to use some other method to tokenize my strings?
>
> I see something about the concept of a TokenizerFunction in the
> documentation, but I don't really have any idea how to implement one, or if
> it would even be helpful in this situation. I'm rather new to the boost
> libraries and template usage in general, so all help and suggestions are
> welcome.
>
> Thanks,
>
> - Dennis
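
On the TokenizerFunction question in the quoted message: below is a minimal sketch of a quote-aware separator, offered as one possible approach rather than the recommended one. The names QuotedPlusSeparator, splitFields and concatenateLiterals are hypothetical and do not come from Boost. The functor follows the shape the TokenizerFunction concept describes (an operator() taking the current position, the end position and an output token, plus a reset()), so it should be usable as boost::tokenizer<QuotedPlusSeparator>, but that pairing is untested here. The sketch itself uses only the standard library. It assumes no escaped quotes inside literals, treats only spaces as ignorable whitespace, and does not report an empty field after a trailing '+'.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical functor shaped like a TokenizerFunction: each call extracts
// one field, advances `next` past it, and returns false when input is spent.
struct QuotedPlusSeparator {
    void reset() {}  // no state is carried between tokens

    template <typename InputIterator, typename Token>
    bool operator()(InputIterator& next, InputIterator end, Token& tok) {
        tok = Token();
        // Skip spaces in front of the field.
        while (next != end && *next == ' ') ++next;
        if (next == end) return false;

        bool inQuotes = false;
        for (; next != end; ++next) {
            const char c = *next;
            if (c == '"') {
                inQuotes = !inQuotes;          // entering or leaving a literal
            } else if (c == '+' && !inQuotes) {
                ++next;                        // consume the real separator
                break;
            }
            tok += c;                          // quotes are kept in the token
        }
        // Drop spaces between the end of the field and the separator.
        while (!tok.empty() && tok.back() == ' ') tok.pop_back();
        return true;
    }
};

// Drives the functor by hand so the sketch runs without Boost.
std::vector<std::string> splitFields(const std::string& spec) {
    std::vector<std::string> fields;
    QuotedPlusSeparator sep;
    std::string::const_iterator next = spec.begin();
    std::string tok;
    while (sep(next, spec.end(), tok)) fields.push_back(tok);
    return fields;
}

// Strips the quotes from literals and joins the fields. Unquoted fields are
// copied as-is; a real program would substitute the pre-defined entity here.
std::string concatenateLiterals(const std::string& spec) {
    std::string result;
    for (const std::string& f : splitFields(spec)) {
        if (f.size() >= 2 && f.front() == '"' && f.back() == '"')
            result += f.substr(1, f.size() - 2);
        else
            result += f;
    }
    return result;
}
```

Because the quotes are kept in each token, the caller can still tell a string literal from a pre-defined entity, as the loop in the quoted message needs to. Calling concatenateLiterals on the second example from the quoted message should give 1 + 2 = 3. For anything beyond this simple grammar (escapes, nested expressions, error reporting), a real parser such as Spirit is likely the better fit.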