Subject: [Boost-users] Batch job processing with Boost Spirit parser
From: Michael Levine (shmuel.levine_at_[hidden])
Date: 2014-09-04 16:27:44
I have been using Boost Spirit as a parser for a project that I have been
working on lately. At this point, I have been trying to expand the
software and, in doing so, have had the nagging feeling that there is
something wrong with my overall design. In the best case, I am not
making the best use of my tools, and in the worst case, I am concerned
that the design/code is becoming overly brittle. As an aside, I don't
have a huge amount of programming experience with this type of application.
A word of background / context:
Essentially, the program performs batch "jobs" which are specified in a
text file (not dissimilar in that sense from a scheduler like Condor).
Spirit parses these "Job Description" files into "specification objects"
(my vocabulary) that are used with a builder pattern and factories to
create the appropriate objects. I have some concerns about this which I
will describe later.
The job description (henceforth: JD) file should have 3 parts:
1. Data
2. Tools
3. Operations
The Data section is just a list of data that is to be operated on. At
this stage, this is just a vector of std::pair<std::string, std::string>
containing an identifier and the path of the file. The "Tools" and
"Operations" sections are likely composed of nested specifications (the
resulting objects use either a decorator pattern or a composite pattern,
depending on the type of tool), which was the main reason that I
started using Spirit in the first place.
My question is really a request for some guidance on how to better utilize
the tools available:
I have just received the requirement for section #3 (Operations), to be
included in the JD file (previously it was assumed that this would be
provided in a different manner). So, at this time, I have a working
parser for the Data and Tools portions. I have concerns with the
"Tools" portion that I would like to correct and not duplicate in the
"Operations" section that I am to be working on next.
At present, I am parsing the JD file mostly into strings, and vectors of
boost::variant<int,double> -- the latter being a list of parameters in a
completely arbitrarily imposed order. As I've previously mentioned, I
have a few problems with this approach.
- The parameters are required to be input in an arbitrary order.
- Different "Tools" and "Operations" have different parameters that are
  required and/or optional.
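For illustration, here is a minimal sketch of one way the two problems above could be addressed: parameters are looked up by name, so their order stops mattering, and each tool type carries a small description of which names are required and which are optional. This is only a sketch under assumptions. Param_Schema, Named_Params, validate and the parameter names "low", "high" and "sigma" are hypothetical, and std::variant (C++17) stands in for the boost::variant<int, double> used in the post.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <variant>
#include <vector>

// Stand-in for the boost::variant<int, double> used in the post.
using Param_Value = std::variant<int, double>;

// Parameters keyed by name, so their order in the JD file is irrelevant.
using Named_Params = std::map<std::string, Param_Value>;

// Hypothetical per-tool description of the accepted parameter names.
struct Param_Schema {
    std::set<std::string> required;
    std::set<std::string> optional;
};

// Returns a list of problems; an empty list means the parameters are valid.
std::vector<std::string> validate(const Param_Schema& schema,
                                  const Named_Params& params) {
    std::vector<std::string> problems;
    // Every required name must be present.
    for (const std::string& name : schema.required) {
        if (params.count(name) == 0) {
            problems.push_back("missing required parameter: " + name);
        }
    }
    // Every supplied name must be either required or optional.
    for (const auto& entry : params) {
        const std::string& name = entry.first;
        if (schema.required.count(name) == 0 &&
            schema.optional.count(name) == 0) {
            problems.push_back("unknown parameter: " + name);
        }
    }
    return problems;
}

// Usage example with a hypothetical "threshold" tool.
void validate_example() {
    Param_Schema threshold;
    threshold.required = {"low", "high"};
    threshold.optional = {"sigma"};

    Named_Params given;
    given["high"] = 0.9;  // supplied in a different order than declared
    given["low"] = 1;
    assert(validate(threshold, given).empty());

    given.erase("low");   // now a required parameter is missing
    given["bogus"] = 3;   // and an unknown one is present
    assert(validate(threshold, given).size() == 2);
}
```

A schema like this could be stored per tool type (for example in a map keyed by the type string of Tool_Spec), so that adding a new tool means adding one schema entry rather than changing the grammar.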
For the past weeks (I only work on this project on a very part-time
basis), I have been going in circles trying to figure out a better way
to do this. I have a strong suspicion that Fusion can be used for this.
I am also concerned about over-complicating the design, but am wary
of leaving the design too simplistic. I.e., I know that I can treat the
parameters as a std::pair<std::string, boost::variant<int, double>> and
then parse key-value pairs with a given delimiter. I am just not
convinced that this is the best way to do this.
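To make the key-value idea concrete, the following is a small hand-rolled sketch, not a Spirit grammar, that splits a line into name/value pairs. The input format is an assumption: pairs are written as name=value and separated by '|' (the delimiter already used for options), with no surrounding whitespace. std::variant (C++17) again stands in for boost::variant<int, double>, and parse_value and parse_params are hypothetical names.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// Stand-in for the boost::variant<int, double> used in the post.
using Param_Value = std::variant<int, double>;
using Named_Param = std::pair<std::string, Param_Value>;

// Yields an int only if the whole token is an integer; otherwise a double.
Param_Value parse_value(const std::string& text) {
    std::size_t used = 0;
    try {
        const int as_int = std::stoi(text, &used);
        if (used == text.size()) return as_int;
    } catch (const std::exception&) {
        // Not an int (or out of range for int); try double below.
    }
    const double as_double = std::stod(text, &used);  // throws if not numeric
    if (used != text.size()) {
        throw std::invalid_argument("trailing characters in value: " + text);
    }
    return as_double;
}

// Splits "name=value|name=value" into pairs ('=' and '|' are assumed delimiters).
std::vector<Named_Param> parse_params(const std::string& line) {
    std::vector<Named_Param> result;
    std::istringstream stream(line);
    std::string item;
    while (std::getline(stream, item, '|')) {
        const std::size_t eq = item.find('=');
        if (eq == std::string::npos || eq == 0) {
            throw std::invalid_argument("expected name=value, got: " + item);
        }
        result.emplace_back(item.substr(0, eq),
                            parse_value(item.substr(eq + 1)));
    }
    return result;
}

// Usage example with hypothetical parameter names.
void parse_example() {
    const std::vector<Named_Param> params = parse_params("window=5|sigma=0.25");
    assert(params.size() == 2);
    assert(params[0].first == "window");
    assert(std::get<int>(params[0].second) == 5);
    assert(params[1].first == "sigma");
    assert(std::get<double>(params[1].second) == 0.25);
}
```

Note that the integer check requires the whole token to be consumed before an int is accepted, so a value such as 0.25 is not split into an integer part and a remainder.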
I am able to use any Boost library, and have no restrictions about
compilers (I'm using a recent version of Clang, primarily, right now).
The following is a selection of the data structures and Spirit grammar
that I am using here:
struct Tool_Spec;
typedef std::vector<boost::variant<int, double> > Tool_Options_t;
typedef std::vector<Tool_Spec > Children_Tool_t;
typedef std::pair<std::string, std::string> Data_Spec;
struct Job_Request {
    Data_Spec data_spec;
    Tool_Spec model_spec;
    Operation_Spec operation_spec;
    boost::optional<std::string> description;
};
struct Tool_Spec {
    std::string type;
    std::string data_designation;
    Tool_Options_t options;
    Children_Tool_t children;
    boost::optional<std::string> designation;
};
struct Operation_Spec {
    /*
      Unknown at this time. Need help
    */
};
Datafile %= lit("@START")
    >> *Data_Description
    >> Tool_Description
    >> Operation_Description
    >> lit("@END")
    ;
Data_Description %= lit('%')
    >> Datasource
    >> lit(';')
    ;
Datasource =
    Designator
    >> lit(':')
    >> Designator
    ;
Designator %= +(char_("0-9a-zA-Z/._") | char_('-') );
Comment_Designator %= +(char_("0-9a-zA-Z/._, ()") | char_('-'));
Tool_Description %=
    Designator
    >> ':'
    >> ('@' >> Designator)
    >> '['
    >> +Options
    >> ']'
    >> -('{' >> *Child_Tool >> '}')
    >> -qi::lexeme[Comment_Designator]
    >> ';'
    ;
Child_Tool %=
    Designator
    >> ':'
    >> ('@' >> Designator)
    >> '['
    >> +Options
    >> ']'
    >> -('{' >> *Child_Tool >> '}')
    >> -qi::lexeme[Comment_Designator]
    >> ';'
    ;
// Note: int_ is tried before double_, so an input such as 1.5 matches int_ as 1.
Options %= (int_ | double_ ) % '|';