Boost :
From: rogeeff (rogeeff_at_[hidden])
Date: 2002-01-17 13:01:45
--- In boost_at_y..., "David A. Greene" <greened_at_e...> wrote:
> rogeeff wrote:
>
> >>used Spirit (yet). But I know from reading the Spirit mailing list
> >>that Joel, et al. have put in lots of thought on how to keep things
> >>lightweight.
> >
> > What I meant is that adding the line #include "boost/spirit/spirit.hpp" to
> > your code immediately pulls in ~600k of include files (and this
>
>
> Besides compile time, which I admit could be significant depending
> on what bits of Spirit are being used, what's the problem here?
This by itself is a problem. A simple task like CLA parsing should not
significantly affect compile time. Since I can't compile Spirit with
any of my compilers I can't test it, but I would like to hear your
numbers.
>
> > implemented. Now by default the parser should be able to handle integer
> > types, floating point values, strings, boolean values (flags) and
> > probably also some support for collections of them. I hope you agree
> > that I do not need Spirit to parse an integer value from a string. Also
>
>
> The values are not the problem. The problem is the myriad of
> command-line formats. Is there an '=' between the option name
> and value, a space, a comma, nothing? Is this a single letter option?
> A multi-character option? One dash or two? Is any nesting involved?
In reality, not that much. And all the cases you described could be
trivially parsed by much simpler means.
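
To illustrate the kind of "much simpler means" meant here, the following
is a minimal sketch, not a proposed interface: a hand-written function that
splits one command-line element into dash count, option name and attached
value. The names option_token and split_option are hypothetical and exist
only for this example. A value separated by a space ("--name value") is
assumed to be handled by the caller looking at the next element of argv.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical result type for illustration; not part of any Boost library.
struct option_token {
    int         dashes;     // 0 = positional, 1 = "-x", 2 = "--name"
    std::string name;       // option name without dashes (empty if positional)
    std::string value;      // attached value, or the positional text itself
    bool        has_value;  // false for a bare flag such as "-v"
};

// Split one argv element into dash count, name, and attached value.
// Accepts "-v", "-ofile", "-o=file", "--name", and "--name=value".
inline option_token split_option(const std::string& arg) {
    option_token t;
    t.dashes = 0;
    t.has_value = false;

    // Count at most two leading dashes.
    std::size_t n = 0;
    while (n < 2 && n < arg.size() && arg[n] == '-') ++n;

    // No dashes, or nothing after them ("-" or "--"): treat as positional.
    if (n == 0 || n == arg.size()) {
        t.value = arg;
        t.has_value = true;
        return t;
    }

    t.dashes = static_cast<int>(n);
    const std::string body = arg.substr(n);
    assert(!body.empty());  // guaranteed by the check above

    const std::string::size_type eq = body.find('=');
    if (eq != std::string::npos) {            // "-o=file", "--name=value"
        t.name = body.substr(0, eq);
        t.value = body.substr(eq + 1);
        t.has_value = true;
    } else if (n == 1 && body.size() > 1) {   // "-ofile": value glued to letter
        t.name = body.substr(0, 1);
        t.value = body.substr(1);
        t.has_value = true;
    } else {                                  // "-v", "--verbose"
        t.name = body;
    }
    return t;
}
```

Nesting or other exotic formats are deliberately out of scope; they would
belong in a user-supplied argument class.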
>
> > the framework should support the ability for the user to define his own CLA
> > class, with its own parsing logic. And here he (the user) can use
> > whatever means he prefers to implement it (tokenizer, regexp,
> > handwritten code, Spirit). But this is not a part of the CLA parser
> > framework - it's user code. There are also several other points:
>
>
> But it _is_ part of the framework. I'd imagine a user would want
> his or her extensions to blend nicely with the existing tools.
> Spirit allows that.
But I do not want it to be a part of the framework. If the user prefers, he
could use it, but it should not be required.
>
>
> > * I was not able to find a portability report for Spirit. Since CLA
> > parsing is a very basic facility, I should be able to compile it on
> > the majority of compilers.
>
>
> This _is_ an issue with Spirit. The developers are working on it.
> IMHO braindead compilers should not direct the design of libraries,
> though of course every effort should be made to support them if
> at all possible.
A CLA parser is a very basic facility. It *should* compile on the majority of
compilers.
>
> > * I could be wrong, but Spirit seems to be a static compile-time
> > facility. I.e. I can't load a CLA scheme dynamically or read it from
> > a configuration file. Also how would it support distributed definitions?
>
> Spirit is not static. Dynamic grammars are possible.
Could you load a rule from an external file?
>
> > * Even if I do not load parser rules from an external file, I still
> > could be in a situation where I do not know the parsing rules at my
> > compile time, because I am a library developer and the parsing rules are
> > provided by my users.
>
>
> Er...huh? What's the issue here?
If I understood your code below properly, Spirit supports distributed
definitions. But this makes me move spirit.hpp into the header file to
be able to provide an interface register_cla( rule<> ... ). Now
Spirit is inherited by my users and is not an implementation detail.
>
>
> > Spirit is a parser framework. Command line/Configuration processing
> > is a different realm with different rules and priorities.
>
> No, it isn't. A sufficiently rich command-line scheme almost
> certainly requires more than a tokenizer. It's unfortunate that
> most people, when they hear "parser," think "compiler." I'd guess
> that 99% of parser usage is completely outside the realm of
> language translation. Unfortunately, many of those parsers are
> hand-coded and fragile.
I would assume that anything more complex than what can be provided
by a simple (but powerful) generic CLA parser will account for no more than 1
percent of CLA parsing needs. The other 99% will be happy without Spirit.
>
>
> > I would assume that there are a lot of programmers that never had a
> > need to parse a formal grammar so complex that they would need YACC
> > or even simply knowledge of EBNF, though I do not question its value.
>
>
> My experience is that most programmers aren't familiar with YACC,
> etc. and waste time writing custom parsers. Then when they discover
> the available tools they either wish they'd had them earlier or
> rewrite their software to use them.
Formal grammar parsing should be used for formal grammar parsing. You
found a very cool parsing framework that allows you to substitute a 20-line
parsing function with a one-liner, for the price of 600k of includes,
learning EBNF, compilation time and probably portability. Would you
rush to use it? I'd rather not.
> >>You missed the point. Spirit is flexible enough for many, many
> >>parsing tasks, including implementation of the command-line parser.
> >>One need not expose the Spirit interface to the programmer. But it
> >>makes a great deal of sense to me to use Spirit to do the actual
> >>parsing.
> >
> > I did not get it. What will provide an interface and where do you see
> > a place for Spirit? Specifically, with an example.
>
>
> Well, off the top of my head, I can imagine this (note: this is
> probably not correct Spirit-wise since I've not yet used Spirit,
> but it gives a general idea):
>
> class CommandLine {
> ...
> public:
> // Implemented with Spirit
> match parse(int argc, char **argv) const;
>
> template<class Val>
> void addOption(const std::string &name, Val &valueToSet);
This is just not the case. I may not have names at all. I may not want
to set any value. How would I specify what kind of argument
identification to use?
>
> // Extend the parser in new and interesting ways
> ruleTag addRule(rule<> &newOptionConstruct);
> void removeRule(ruleTag ruleToRemove);
And why would I need to rely on Spirit?
I would provide a function add( argument* );
where class argument defines an abstract interface that any user-defined
argument should comply with. The framework also defines several predefined
concrete arguments: int_argument, bool_argument, string_argument and
so on.
In general I would implement the CLA parser as a pluggable factory (see
the implementation in the vault area), where each argument has a factory
which knows how to identify the argument in the (argc,argv) stream and how to
construct the argument from it.
> };
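
A minimal sketch of the add( argument* ) design described just above, under
stated assumptions: the try_parse signature, the cla_parser class name and
the non-owning add() are all invented for illustration, and the separate
per-argument factory is folded into the argument itself to keep the example
short. It is not the implementation from the vault area.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Abstract interface every argument must satisfy (hypothetical signature).
class argument {
public:
    virtual ~argument() {}
    // Try to consume tokens starting at argv[pos]. On success, advance pos
    // past the consumed tokens and return true; otherwise leave pos alone.
    virtual bool try_parse(int argc, char** argv, int& pos) = 0;
};

// Predefined argument of the form "<name> <integer>", e.g. "--count 42".
class int_argument : public argument {
public:
    explicit int_argument(const std::string& name)
        : name_(name), value_(0), found_(false) {}

    virtual bool try_parse(int argc, char** argv, int& pos) {
        if (name_ != argv[pos] || pos + 1 >= argc) return false;
        char* end = 0;
        const long v = std::strtol(argv[pos + 1], &end, 10);
        // Reject empty or partially numeric text (overflow is not checked).
        if (end == argv[pos + 1] || *end != '\0') return false;
        value_ = v;
        found_ = true;
        pos += 2;
        return true;
    }
    bool found() const { return found_; }
    long value() const { return value_; }

private:
    std::string name_;
    long        value_;
    bool        found_;
};

// Predefined flag argument, e.g. "-v".
class bool_argument : public argument {
public:
    explicit bool_argument(const std::string& name)
        : name_(name), value_(false) {}

    virtual bool try_parse(int /*argc*/, char** argv, int& pos) {
        if (name_ != argv[pos]) return false;
        value_ = true;
        ++pos;
        return true;
    }
    bool value() const { return value_; }

private:
    std::string name_;
    bool        value_;
};

// The framework sees only the abstract interface.
class cla_parser {
public:
    // Non-owning: the caller keeps the argument objects alive.
    void add(argument* arg) {
        assert(arg != 0);
        args_.push_back(arg);
    }

    // Returns true if every token was claimed by some registered argument.
    bool parse(int argc, char** argv) {
        int pos = 1;  // skip the program name
        while (pos < argc) {
            bool matched = false;
            for (std::size_t i = 0; i < args_.size() && !matched; ++i)
                matched = args_[i]->try_parse(argc, argv, pos);
            if (!matched) return false;
        }
        return true;
    }

private:
    std::vector<argument*> args_;
};
```

A user-defined argument would derive from argument in the same way and could
use a tokenizer, regexp, handwritten code or Spirit inside try_parse without
the framework header knowing about it.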
>
> This is really off-the-cuff -- something better should be
> provided. Here the only reason the programmer needs know
> about Spirit is when invoking the addRule member.
>
> > There are several questions:
> >
> > 1. How portable it is?
> > 2. How it affect compilation time?
> > 3. How it affect code size?
>
>
> These are certainly valid questions. The only way we're going to
> answer them is to experiment.
My experiments failed. I was not able to compile Spirit on any of my
compilers.
>
>
> > Let us not forget that this framework is supposed to fit the majority
> > of programs, from a tiny test program to a complex, bulky process.
>
> I was thinking about this the other day. Tiny test programs are
> generally the ones that require very simple option syntaxes. Larger
> programs with their many options require something more heavyweight.
> Perhaps a CLA library should provide both. To start out, though, my
> leaning would be toward implementing all of it in Spirit and then
> moving to a specialized "simple" CLA class if that proves necessary.
>
> >>I don't know what you mean by "arbitrary parsing." Spirit is at
> >>least as flexible as YACC (well, except for left-recursion,
> >>probably :)).
> >
> > How about error handling? what if I want to ignore an error and
> > proceed. How one-liner below would handle it?
>
>
> Don't read too much into Dan's example. Of course it doesn't
> have error reporting. That would be placed in the semantic
> actions.
So it would not be a one-liner any more (at least the actions would have
to be defined somewhere).
>
>
> >>regexp and tokenizer don't necessarily have enough power to do
> >>the job. Consider the option format we use in our software:
>
> > As I said above, you sure have to have the ability to implement your
> > own "very complex" parsing and somehow plug it into the framework.
> > But it should not be part of the framework.
>
>
> Maybe. As long as it can be extended to accomplish what we
> need, then I guess it's not a big deal if CLA proper doesn't
> provide it. We can release our extensions as CLA++ or
> something. :)
>
> In any case, doing that sort of extension with Spirit is very,
> very easy, unlike with tools such as YACC. One simply subtracts
> rules, adds rules, etc.
I have nothing against implementing extensions using Spirit. But it
should be an implementation detail.
>
>
>
> >>>I would assume that a command-line parser will still have a MUCH
> >>>simpler interface.
> >>
> >>And by trading off flexibility for simplicity that parser can
> >>still have the same interface but be implemented with Spirit.
>
> > I did not say that I agree with any flexibility tradeoff. But the
> > interface should be as simple as possible: 1. plug in a parsing rule, 2.
> > parse, 3. get the value. A couple of predefined parsing rules, like for
> > integer, string, etc., plus the ability to plug in an arbitrary
> > user-defined parsing rule. There could be variations and some
> > enhancements, but something around this (in reality you would also
> > want the framework to support several predefined kinds of argument
> > identification for the user to choose from).
>
>
> Sounds good to me, though I don't necessarily agree that "get value"
> should be the sole interface for doing things with options. Often
> I want to execute an arbitrary piece of code when I parse an option
> (or process it later in an abstract syntax tree, etc.).
Get the value and then do whatever you want to do.
>
>
> Validation is also important. The programmer should be able
> to specify dependencies (i.e. if this option is set, this other
> one is implied or needed), provide validator objects to check
> specified values and so forth. Some of this stuff could be
> implemented via extensions.
If I understand you properly, Spirit does not support this, does it?
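
Whatever does the parsing, a dependency check of the kind described in the
quoted paragraph can run as a separate step after it. A minimal sketch,
assuming the parser can report the set of option names it saw;
dependency_checker and its members are hypothetical names, not an existing
interface.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical post-parse validator: records "if A is present, B is needed".
class dependency_checker {
public:
    void require(const std::string& option, const std::string& needed) {
        assert(!option.empty() && !needed.empty());
        rules_.push_back(std::make_pair(option, needed));
    }

    // Returns one message per rule violated by the options actually seen.
    std::vector<std::string> check(const std::set<std::string>& seen) const {
        std::vector<std::string> errors;
        for (std::size_t i = 0; i < rules_.size(); ++i) {
            const bool has_option = seen.count(rules_[i].first) != 0;
            const bool has_needed = seen.count(rules_[i].second) != 0;
            if (has_option && !has_needed)
                errors.push_back(rules_[i].first + " requires " + rules_[i].second);
        }
        return errors;
    }

private:
    std::vector<std::pair<std::string, std::string> > rules_;
};
```

Because it only needs the names of the options seen, a check like this does
not depend on how the individual arguments were parsed.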
>
> The ability to plug in arbitrary parsing rules is exactly the
> argument for using Spirit. With tokenizer or regex this will
> be very painful.
It should not make any difference how I implement custom parsing,
since the framework would use virtual methods. Which tool to use for
the implementation is the user's preference.
>
> >>I agree Spirit looks a little cryptic. In particular the assignment
> >>of values is rather "magical" ("ref" should probably be named
> >>"assign_to"). But even so, as someone who has experience with YACC
> >>but zero with Spirit, I can follow this and understand what it means
> >>(except for the bang, which I had to look up, but it makes sense
> >>if you consider it an "|" with an empty left operand).
>
> > How many programmers are familiar with YACC and how many would use
> > a CLA parser?
>
>
> Let me rephrase the question as, "how many programmers should be
> familiar with YACC or similar tools?" The answer, of course,
> is most of them -- roughly the same set that would need a CLA parser.
I disagree with that. CLA is a much more basic facility.
>
> >>I don't see a token_iterator(char *, char *) constructor. I don't
> >>even see a "token_iterator" declared anywhere. Are you sure your
> >>example works? Am I missing something?
> >
> > Wait, wait. It is not the boost tokenizer (it is my simple token_iterator
> > I am using for an old Sun compiler that can't handle the boost one).
>
>
> Ah, ok, I didn't follow that at all. That's important information.
> I _was_ missing something. :)
>
> > definition
> > token_iterator( const_string string_to_tokenize,
> > const_string delimeters)
> >
> > it should be pretty easy to understand what is written there.
>
>
> Sure. It still doesn't have the power necessary to do what I
> want a CLA parser to do.
Look, I heard about a very cool parsing framework named Spirit. Maybe
you could use it to *implement* your custom parsing.
>
>
> -Dave
>
>
> --
>
> "Some little people have music in them, but Fats, he was all music,
> and you know how big he was." -- James P. Johnson
Gennadiy.
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk