Boost Users :

Subject: Re: [Boost-users] Interested in parsing tools
From: Ramon F Herrera (ramon_at_[hidden])
Date: 2009-09-12 15:40:12

Next message: OvermindDL1: "Re: [Boost-users] Interested in parsing tools"
Previous message: Nick Edmonds: "Re: [Boost-users] Parallel BGL status in Boost"
In reply to: OvermindDL1: "Re: [Boost-users] Is there anything wrong with the Gmane newsgroup? + Interested in parsing tools and networking code"
Next in thread: OvermindDL1: "Re: [Boost-users] Interested in parsing tools"
Reply: OvermindDL1: "Re: [Boost-users] Interested in parsing tools"
Reply: Joel de Guzman: "Re: [Boost-users] Interested in parsing tools"

OvermindDL1 wrote:
>> One of my main current interests is parsing. Trying to decide among the
>> choices:
>>
>> - Regex
>> - Spirit
>> - Xpressive
>
> Depends on what you are wanting to parse. If you want to do, say, a
> search and replace in a file, Xpressive is best, if you want to parse
> data structures and you want the absolute best speed and a completely
> unambiguous grammar, Spirit2.1 for sure. Do not bother with Regex
> itself as Xpressive can do everything Regex can, but more and better.
>

Thanks so much, OvermindDL1!

Allow me to describe my target data. I initially had a bunch of files
with lines like this:

Variable Name = Variable Value

These are some examples:

--------------------------------------------------------------------
My Favorite Baseball Player = George Herman "Babe" Ruth

What did you do on Christmas = I rested, computed the % mortgage and visited my brother + sister.

Favorite Curse = That umpire is a #&*%!
--------------------------------------------------------------------

I quickly solved the above parsing with Regex like this:

string variable = "([A-Za-z0-9][\\w\\h\\(\\)\\-\\.,/&]*)";  // LHS: variable name
char equal_sign = '=';
string value = "(.+)";                                       // RHS: rest of the line
string assignment = variable + equal_sign + value;           // full pattern

After retrieving the LHS and the RHS I store them for subsequent use in
a map<string, string> data structure.
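
For reference, the step just described (match a line, then store the LHS and RHS in a map) might look roughly like the sketch below. It is not the original code: it uses std::regex as a stand-in for Boost.Regex, spells \h out as a space and a tab because the ECMAScript grammar has no \h escape, and trims surrounding whitespace, which the original pattern does not do. The helper names trim and parse_assignment are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <regex>
#include <string>

// Hypothetical helper: strips spaces and tabs from both ends of a string.
std::string trim(const std::string& s) {
    const auto first = s.find_first_not_of(" \t");
    if (first == std::string::npos) return "";
    const auto last = s.find_last_not_of(" \t");
    return s.substr(first, last - first + 1);
}

// Hypothetical helper: parses one "Name = Value" line into `out`.
// Returns false when the line does not look like an assignment.
bool parse_assignment(const std::string& line,
                      std::map<std::string, std::string>& out) {
    // Same shape as the pattern above; "\h" is written as " \t" here
    // because std::regex (ECMAScript grammar) does not know "\h".
    static const std::regex assignment(
        "([A-Za-z0-9][\\w \\t()\\-.,/&]*)=(.+)");
    std::smatch m;
    if (!std::regex_match(line, m, assignment)) return false;
    assert(m.size() == 3);  // whole match, LHS, RHS
    // Trimming is an assumption of this sketch, not part of the pattern.
    out[trim(m[1].str())] = trim(m[2].str());
    return true;
}
```

Since the name pattern cannot contain '=', the first '=' on the line is what separates the name from the value; everything after it, including punctuation such as #&*%!, ends up in the value.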

My data, however, just became a bit more challenging. It is now divided
into blocks:

[Unique ID 1]
Variable Name = Variable Value
Variable Name = Variable Value
Variable Name = Variable Value

[Unique ID 2]
Variable Name = Variable Value
Variable Name = Variable Value
Variable Name = Variable Value

[Unique ID 3]
Variable Name = Variable Value
Variable Name = Variable Value
Variable Name = Variable Value

(etc.)

Again, I would like to store the new format in a map, using the Unique
ID as key to retrieve the block of lines underneath each ID.
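
One possible way to do this while staying with plain regular expressions is a map of maps keyed by the Unique ID, filled line by line. The sketch below is only an illustration under a few assumptions: std::regex stands in for Boost.Regex, the names Block, trim and parse_blocks are hypothetical, blank or unrecognized lines are skipped, and assignments that appear before the first [ID] header are ignored.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <regex>
#include <sstream>
#include <string>

// One block: the "Name = Value" pairs found under a single [ID] header.
using Block = std::map<std::string, std::string>;

// Hypothetical helper: strips spaces and tabs from both ends of a string.
std::string trim(const std::string& s) {
    const auto first = s.find_first_not_of(" \t");
    if (first == std::string::npos) return "";
    const auto last = s.find_last_not_of(" \t");
    return s.substr(first, last - first + 1);
}

// Hypothetical helper: reads "[Unique ID]" headers followed by
// "Name = Value" lines and groups the pairs by ID.
std::map<std::string, Block> parse_blocks(std::istream& in) {
    static const std::regex header("\\s*\\[([^\\]]+)\\]\\s*");
    static const std::regex assignment(
        "([A-Za-z0-9][\\w \\t()\\-.,/&]*)=(.+)");

    std::map<std::string, Block> blocks;
    Block* current = nullptr;  // no block is open until a header is seen
    std::string line;
    std::smatch m;
    while (std::getline(in, line)) {
        if (std::regex_match(line, m, header)) {
            // std::map never relocates its elements, so this pointer
            // stays valid while later blocks are inserted.
            current = &blocks[trim(m[1].str())];
        } else if (current && std::regex_match(line, m, assignment)) {
            assert(m.size() == 3);  // whole match, LHS, RHS
            (*current)[trim(m[1].str())] = trim(m[2].str());
        }
        // Anything else (blank lines, stray text) is skipped.
    }
    return blocks;
}
```

With this layout, blocks["Unique ID 1"] returns the inner map for that block, and blocks["Unique ID 1"]["Variable Name"] returns a single value.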

At this stage, I am wondering whether to continue using tried and true
(and already learned!) Regex, or to get my feet wet with a more powerful
tool, such as the one recommended by OvermindDL1 (Xpressive).

How does Xpressive compare with ANTLR? I am torn between them.

TIA,

-Ramon


Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net