From: David A. Greene (greened_at_[hidden])
Date: 2002-01-17 14:00:30


rogeeff wrote:

>>Besides compile time, which I admit could be significant depending
>>on what bits of Spirit are being used, what's the problem here?
>
> This by itself is a problem. The simple task of CLA parsing should not
> significantly affect compile time. Since I can't compile Spirit with
> any of my compilers I can't test it, but I would like to hear your
> numbers.

Fair enough. I don't have any numbers and it will be a while
before I can get any. I do want to do the experiment, though
it would be nice if someone else presented numbers first.

>>The values are not the problem. The problem is the myriad of
>>command-line formats. Is there an '=' between the option name
>>and value, a space, a comma, nothing? Is this a single letter
>>option? A multi-character option? One dash or two? Is any
>>nesting involved?
>
> In reality not that much. And all the cases you described could be
> trivially parsed by much simpler means.

How? Which means would those be?

>>>the framework should support the ability for the user to define his own
>>>CLA class, with its own parsing logic. And here he (the user) can use
>>>whatever means he prefers to implement it (tokenizer, regexp,
>>>handwritten code, Spirit). But this is not a part of the CLA parser
>>>framework - it's user code. There are also several other points:
>>
>>But it _is_ part of the framework. I'd imagine a user would want
>>his or her extensions to blend nicely with the existing tools.
>>Spirit allows that.
>
> But I do not want it to be a part of the framework. If the user prefers,
> he can use it, but it should not be required.

Perhaps that's a legitimate desire. I still think programmers
should know about grammars in the same way they should know about
regular expressions. We could, I suppose, provide an interface
that says, "apply this function object to all input text" -- in
effect replace the parser. The function object could make use
of CLA interfaces to parse simple option formats, etc. This is
all off the top of my head -- something better should be designed.
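
To make the "replace the parser" idea concrete, here is a minimal
sketch in modern C++. It is not Spirit and not an existing Boost
interface: the names ClaParser, setParser and parseSimple are
hypothetical, and the "--name=value" format handled by the built-in
helper is only an assumed default.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical framework class; nothing here is an existing Boost API.
class ClaParser {
public:
    using Args = std::vector<std::string>;
    using Options = std::map<std::string, std::string>;
    // A user-supplied parser sees all input text, fills the option map,
    // and reports success or failure.
    using Parser = std::function<bool(const Args&, Options&)>;

    // "Apply this function object to all input text": replaces the default.
    void setParser(Parser p) { parser_ = std::move(p); }

    bool parse(const Args& args) {
        options_.clear();
        return parser_ ? parser_(args, options_) : parseSimple(args, options_);
    }

    const Options& options() const { return options_; }

    // Built-in helper for "--name=value" and "--flag". It is public so
    // that a custom parser can reuse it for the simple cases.
    static bool parseSimple(const Args& args, Options& out) {
        for (const std::string& a : args) {
            if (a.size() < 3 || a.compare(0, 2, "--") != 0) return false;
            const std::string::size_type eq = a.find('=');
            if (eq == std::string::npos) out[a.substr(2)] = "";
            else out[a.substr(2, eq - 2)] = a.substr(eq + 1);
        }
        return true;
    }

private:
    Parser parser_;    // empty means "use parseSimple"
    Options options_;
};
```

A user who needs an unusual format passes a lambda (or a functor
built with any tool, Spirit included) to setParser; everyone else
never sees anything but parse() and options().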

>>>* I was not able to find out portability report for Spirit. Since
>>>CLA parsing is very basic facility, I should be able to compile it on
>>>majority of compilers.
>>>
>>
>>This _is_ an issue with Spirit. The developers are working on it.
>>IMHO braindead compilers should not direct the design of libraries,
>>though of course every effort should be made to support them if
>>at all possible.
>
> CLA parser is very basic facility. It *should* compile on majority of
> compilers.

Joel responded to your question with a list of compilers -- I wasn't
aware that Spirit worked on all of them. Keep in mind that Spirit is
still in development. Many Boost libraries have needed tweaks
to support various compilers before they were accepted into Boost.
Spirit and CLA are no exceptions.

>>>* I could be wrong, but Spirit seems to be a static compile-time
>>>facility. I.e. I can't load a CLA scheme dynamically or read it from
>>>a configuration file. Also, how would it support distributed definitions?
>>>
>>Spirit is not static. Dynamic grammars are possible.
>
> Could you load a rule from an external file?

What do you mean by "external file?" At the most general level
Spirit can provide a framework to parse command-line specification
languages and dynamically create a parser on-the-fly. All you
need is a parser to translate from the CLA specification to
a Spirit parser that implements that specification -- in effect
a CLA->Spirit translator where the result is a run-time Spirit
parser object.
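
As a rough illustration of the "specification in, run-time parser
out" idea, the sketch below builds a parser object from a
specification read at run time. It deliberately does not use Spirit
(I have not verified any Spirit calls), so it only stands in for the
translator described above. The spec format ("<name> <kind>" per
line) and the names Kind and RuntimeParser are assumptions made up
for this example.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical spec format, one option per line: "<name> <kind>".
// Kind "flag" takes no value; anything else is treated as "--name=value".
enum class Kind { Flag, Value };

class RuntimeParser {
public:
    // Build the parser from a specification that is only known at run time
    // (a configuration file, a string supplied by a library user, ...).
    explicit RuntimeParser(std::istream& spec) {
        std::string name, kind;
        while (spec >> name >> kind)
            kinds_[name] = (kind == "flag") ? Kind::Flag : Kind::Value;
    }

    // Parse arguments against the loaded specification.
    bool parse(const std::vector<std::string>& args,
               std::map<std::string, std::string>& out) const {
        for (const std::string& a : args) {
            if (a.compare(0, 2, "--") != 0) return false;
            const std::string::size_type eq = a.find('=');
            const bool hasValue = (eq != std::string::npos);
            const std::string name = hasValue ? a.substr(2, eq - 2) : a.substr(2);
            const auto it = kinds_.find(name);
            if (it == kinds_.end()) return false;  // option not in the spec
            // A flag must not carry a value; a value option must carry one.
            if (hasValue != (it->second == Kind::Value)) return false;
            out[name] = hasValue ? a.substr(eq + 1) : "";
        }
        return true;
    }

private:
    std::map<std::string, Kind> kinds_;
};
```

The spec could come from a std::ifstream just as well as from a
std::istringstream; the point is only that the set of accepted
options is not fixed when the program is compiled.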

>>>* Even if I do not load parser rules from external file, I still
>>>could be in a situation when I do not know parsing rules at my
>>>compile time, cause I am a library developer and parsing rules
>>>are provided by my users.
>>>
>>Er...huh? What's the issue here?
>>
> If I understood your code below properly, Spirit supports distributed
> definitions. But this makes me move spirit.hpp into the header file to
> be able to provide an interface register_cla( rule<> ... ). Now
> Spirit is inherited by my users and is not an implementation detail.

True, but I presented only one possible interface. Joel presented
the opposite extreme: a custom CLA specification language. These
sorts of things need to be fleshed out during the design stage.
This discussion is the first step. I'm glad you're pointing out
these important issues.

>>No, it isn't. A sufficiently rich command-line scheme almost
>>certainly requires more than a tokenizer. It's unfortunate that
>>most people, when they hear "parser," think "compiler." I'd guess
>>that 99% of parser usage is completely outside the realm of
>>language translation. Unfortunately, many of those parsers are
>>hand-coded and fragile.
>>
> I would assume that anything more complex than what can be provided
> by a simple (but powerful) generic CLA parser will account for no more
> than 1 percent of CLA parsing needs. The other 99% will be happy
> without Spirit.

Sorry, I didn't follow you here. What is 1% of CLA parsing needs?

>>My experience is that most programmers aren't familiar with YACC,
>>etc. and waste time writing custom parsers. Then when they discover
>>the available tools they either wish they'd had them earlier or
>>rewrite their software to use them.
>
> Formal grammar parsing should be used for formal grammar parsing. You
> found a very cool parsing framework that allows you to substitute a 20-line
> parsing function with a one-liner, for the price of 600k of includes,
> learning EBNF, compilation time and probably portability. Would you
> rush to use it? I'd rather not.

The portability issue is a non-issue since Spirit uses std C++. There
may be some "braindead compiler" issues, but that remains to be seen.

I'd trade 20 lines of potentially buggy hand-coded software for
one line of specification code in a second, even if that line
requires familiarity with some necessary computer science
knowledge. Programmer time is the most valuable time.

>>Well, off the top of my head, I can imagine this (note: this is
>>probably not correct Spirit-wise since I've not yet used Spirit,
>>but it gives a general idea):
>>
>>class CommandLine {
>>...
>>public:
>> // Implemented with Spirit
>> match parse(int argc, char **argv) const;
>>
>> template<class Val>
>> void addOption(const std::string &name, Val &valueToSet);
>>
>
> This is just not the case. I may not have names at all. I may not want
> to set any value. How would I set what kind of argument
> identification to use?

You have interfaces for passing validators and/or semantic action
functors. I presented an interface for the single most common CLA
task. Extension is a matter of simple extrapolation.
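
For what it's worth, here is one way the addOption interface plus a
validator could look in modern C++. This is a sketch under
assumptions, not the Spirit-based implementation discussed above:
parsing is a plain "--name=value" split, values are converted with
operator>>, and the three-argument addOption overload taking a
validator is a hypothetical extension of the quoted declaration.

```cpp
#include <functional>
#include <map>
#include <sstream>
#include <string>

class CommandLine {
public:
    // The common case: bind an option name to a variable.
    template <class Val>
    void addOption(const std::string& name, Val& valueToSet) {
        addOption(name, valueToSet, [](const Val&) { return true; });
    }

    // Hypothetical extension: a validator may reject a converted value.
    // valueToSet is captured by reference and must outlive parse().
    template <class Val, class Validator>
    void addOption(const std::string& name, Val& valueToSet, Validator validator) {
        handlers_[name] = [&valueToSet, validator](const std::string& text) -> bool {
            std::istringstream in(text);
            Val v;
            if (!(in >> v)) return false;                       // not convertible
            if (in.peek() != std::char_traits<char>::eof()) return false;  // trailing junk
            if (!validator(v)) return false;                    // rejected by user code
            valueToSet = v;                                     // the "semantic action"
            return true;
        };
    }

    // Assumed format: every argument after argv[0] is "--name=value".
    bool parse(int argc, const char* const* argv) const {
        for (int i = 1; i < argc; ++i) {
            const std::string arg = argv[i];
            const std::string::size_type eq = arg.find('=');
            if (arg.compare(0, 2, "--") != 0 || eq == std::string::npos) return false;
            const auto it = handlers_.find(arg.substr(2, eq - 2));
            if (it == handlers_.end() || !it->second(arg.substr(eq + 1))) return false;
        }
        return true;
    }

private:
    std::map<std::string, std::function<bool(const std::string&)>> handlers_;
};
```

The caller never asks "is the value there?": after a successful
parse() the bound variables simply hold the values, and a failed
validation leaves the variable untouched.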

I'm not sure what you mean by "argument identification."

>> // Extend the parser in new and interesting ways
>> ruleTag addRule(rule<> &newOptionConstruct);
>> void removeRule(ruleTag ruleToRemove);
>
> And why would I need to rely on Spirit?
> I would provide a function add( argument* ),
> where class argument defines an abstract interface that any user-defined
> argument should comply with.

This is a grammar.

> In general I would implement the CLA parser as a pluggable factory (see
> the implementation in the vault area), where each argument has a factory
> which knows how to identify the argument in the (argc,argv) stream and how to
> construct the argument from it.

This is a parser with semantic actions.

Spirit combines the two to automate the coding process.
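
To show why I read the proposal that way, here is a hand-written
sketch of such a pluggable design (my guess at its shape, not the
code in the vault area). All names are hypothetical. Note that
match() is a recognition rule, accept() is a semantic action, and
the loop in parse() is an ordered alternation over the registered
rules.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

using Tokens = std::vector<std::string>;
using Iter = Tokens::const_iterator;

// Hypothetical abstract interface every user-defined argument implements.
struct Argument {
    virtual ~Argument() = default;
    // Recognition (the "grammar" part): number of tokens matched starting
    // at first, or 0 if this argument does not start here.
    virtual std::size_t match(Iter first, Iter last) const = 0;
    // Construction (the "semantic action" part): consume the matched tokens.
    virtual void accept(Iter first, Iter last) = 0;
};

// Example user-defined argument: "-n <count>".
struct CountArgument : Argument {
    int count = 0;
    std::size_t match(Iter first, Iter last) const override {
        return (last - first >= 2 && *first == "-n") ? 2 : 0;
    }
    // std::stoi throws if the token is not a number; a real
    // implementation would report that as a parse error instead.
    void accept(Iter first, Iter) override { count = std::stoi(*(first + 1)); }
};

class PluggableParser {
public:
    void add(std::shared_ptr<Argument> a) { args_.push_back(std::move(a)); }

    // At each position, try the registered arguments in order.
    bool parse(const Tokens& tokens) {
        Iter pos = tokens.begin();
        while (pos != tokens.end()) {
            std::size_t used = 0;
            for (const auto& a : args_) {
                used = a->match(pos, tokens.end());
                if (used != 0) { a->accept(pos, pos + used); break; }
            }
            if (used == 0) return false;  // no argument recognised this token
            pos += used;
        }
        return true;
    }

private:
    std::vector<std::shared_ptr<Argument>> args_;
};
```

Everything in PluggableParser::parse and in the match() functions
is boilerplate that a parser framework would generate from the rule
definitions.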

>>These are certainly valid questions. The only way we're going to
>>answer them is to experiment.
>
> My experiments failed. I was not able to compile Spirit on any of my
> compilers.

Please let the Spirit team know about your problems. They are
very interested in hearing about users' experiences.

 
>>Don't read too much into Dan's example. Of course it doesn't
>>have error reporting. That would be placed in the semantic
>>actions.
>>
> So it would not be a one liner any more (at least actions should be
> defined somewhere)

Well, they could be embedded into the grammar, keeping it a
single statement, but I think the clearer way is to use
function objects. Your tokenizer example would have to do the
same.

 
>>In any case, doing that sort of extension with Spirit is very,
>>very easy, unlike with tools such as YACC. One simply subtracts
>>rules, adds rules, etc.
>
> I have nothing against implementing extensions using Spirit. But it
> should be an implementation detail.

I think what people are arguing is that the CLA parser should use
Spirit to provide flexibility and reduce programmer burden. The
interfaces to the programmer are yet to be determined. I think
one of them should expose a (possibly limited) Spirit interface.

>>Sounds good to me, though I don't necessarily agree that "get value"
>>should be the sole interface for doing things with options. Often
>>I want to execute an arbitrary piece of code when I parse an option
>>(or process it later in an abstract syntax tree, etc.).
>>
> Get the value and then do whatever you want to do.

Why should I have to do any interacting to check whether a value
is there or what value it is? The parser already knows all of
this stuff. The value should be put into the variable I specified
or a function object should be invoked with the value as an
argument or something else should be done. I should not have
to separately check whether the value is there -- that's what
the parser is for!

>>Validation is also important. The programmer should be able
>>to specify dependencies (i.e. if this option is set, this other
>>one is implied or needed), provide validator objects to check
>>specified values and so forth. Some of this stuff could be
>>implemented via extensions.
>
> If I understand you properly, Spirit does not support this, does it?

Spirit is a parser generator. It is up to the programmer to
provide the semantic actions. This is where attribute propagation
(which Spirit supports very nicely) may come into play.

Alternatively, one could define a dependency specification language
and have Spirit process a programmer-provided dependency
specification to automatically generate a graph and check the
parsed CLA input against it.
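
The checking half of that idea is small. As a sketch, assume the
dependency specification has already been turned into a table
mapping each option to the options it requires (the table layout,
the names Dependencies and checkDependencies, and the message format
are all made up for this example; how the table gets built is left
open):

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical dependency table: option name -> options it requires.
using Dependencies = std::map<std::string, std::vector<std::string>>;

// Check the set of options seen on the command line against the table
// and return one message per unmet requirement (empty means "all good").
std::vector<std::string> checkDependencies(const Dependencies& deps,
                                           const std::set<std::string>& parsed) {
    std::vector<std::string> errors;
    for (const auto& entry : deps) {
        if (parsed.count(entry.first) == 0) continue;  // option not given
        for (const std::string& needed : entry.second) {
            if (parsed.count(needed) == 0)
                errors.push_back("--" + entry.first + " requires --" + needed);
        }
    }
    return errors;
}
```

This only checks direct "requires" edges; implied options and
transitive dependencies would need a real graph traversal.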

>>The ability to plug in arbitrary parsing rules is exactly the
>>argument for using Spirit. With tokenizer or regex this will
>>be very painful.
>
> It should not make any difference how I implement custom parsing,
> since the framework would use virtual methods. Which tool to use for
> the implementation is the user's preference.

We could design interfaces that way. I need more information and
experience to decide what the right interfaces are.

>>>How many programmers are familiar with YACC and how many would
>>>use CLA parser?
>>
>>Let me rephrase the question as, "how many programmers should be
>>familiar with YACC or similar tools?" The answer, of course,
>>is most of them -- roughly the same set that would need a CLA
>>parser.
>
> I disagree with that. CLA is much more basic facility.

It can be, but is often not. I've got a big chunk of software
that demonstrates this. In any event, grammars/parsing/syntax-directed
translation are fundamental computer science concepts with which every
programmer should at least be familiar.

>>Sure. It still doesn't have the power necessary to do what I
>>want a CLA parser to do.
>
> Look, I heard about a very cool parsing framework named Spirit. Maybe
> you could use it to *implement* your custom parsing.

I can't count the number of times I've wished Spirit was available
when I wrote that thing. :)

                              -Dave

-- 
"Some little people have music in them, but Fats, he was all music,
  and you know how big he was."  --  James P. Johnson
