From: David A. Greene (greened_at_[hidden])
Date: 2002-01-17 13:13:56


Schoenborn, Oliver wrote:

>>Nope. Spirit is targeted at any C++ programmer who needs to do
>>parsing, which is most of them. Even more so now in the age of the
>>internet and all its protocols which need to be parsed.
>
> Double Nope, ie yep yep. Spirit is targeted at those who need a grammar,
> not those who need parsing. The simplest kind of parsing has such a simple
> grammar that you can hardly say it has a grammar.

Of course there are cases where a parser generator is not needed.
Anything easily done with cut(1) doesn't need a formally specified
grammar.
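
To make the cut(1) case concrete, here is a minimal C++ sketch of that kind
of grammar-less field extraction. The names split_fields and nth_field are
made up for this illustration; like cut -d, it assumes a single-character
delimiter and knows nothing about quoting or escaping.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Split `line` on one delimiter character, roughly what `cut -d` does.
// No quoting, no escaping: there is no grammar here, just a delimiter.
std::vector<std::string> split_fields(const std::string& line, char delim) {
    std::vector<std::string> fields;
    std::string field;
    std::istringstream in(line);
    while (std::getline(in, field, delim)) {
        fields.push_back(field);
    }
    return fields;
}

// Return the 1-based field `n` (as in `cut -f n`), or an empty string
// if the line has fewer than `n` fields.
std::string nth_field(const std::string& line, char delim, std::size_t n) {
    const std::vector<std::string> fields = split_fields(line, delim);
    return (n >= 1 && n <= fields.size()) ? fields[n - 1] : std::string();
}
```

Something like nth_field("root:x:0:0", ':', 1) picks out the first field.
Once the input needs quoting, nesting or optional parts, this approach
stops scaling, and that is where a grammar starts to pay off.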

But there are many, many tasks where grammars are useful. IMHO
Spirit should absolutely go into Boost provided all of the
issues that come up during review are resolved. It's a real shame
that more programmers don't use parser generators. Lots of time
is being wasted out there.

>>If a programmer doesn't know EBNF, then they are missing an important
>>piece of knowledge, since it's the standard for computer language
>>definition.
>
> If a programmer doesn't know Lisp, Scheme, Prolog, Ada, Perl they are also
> missing important pieces of knowledge, because every computer language,
> whether it is for programming or parsing, has something new to teach. That
> doesn't mean everyone ought to know EBNF. From reading the posts of people
> who are familiar with Yacc, it seems that yacc-familiar people forget how
> cryptic yacc is, and hence that in reality, few programmers are familiar
> with it.

YACC is only cryptic to those who don't know about grammars and
syntax-directed translation, which I sadly admit is the great
majority of programmers. More programmers should know this stuff.
Don't focus too much on EBNF. Programmers should learn the
concepts. A grammar is an important computer science concept
in the same way functional programming is an important computer
science concept. The concrete language is secondary.

> Same perhaps with EBNF. If I don't need a grammar, why would I bother
> learning how to formally describe a grammar?

Because you might need one later.

> A big problem in software maintenance is that code uses tools too
> complex for the average programmer.

In my experience the bigger problem is that such tools are often
custom and proprietary.

> When do I really *need* complex grammar? Only when I need to express
> complex relationships. Yet software should keep relationships as simple as
> possible. Clearly, if you're building a compiler, you have no choice, you
> *need* a complex grammar: the complexity of the grammar will allow for more
> complex problems to be solved. But the average information parsing, this is
> complete overkill, see below.

What do you define as "average?" It turns out that command-line
processing is not particularly easy. Witness all of the discussion
about what formats should be supported, allowances for parser
extension, etc. Software relationships should be as simple as
possible, but no simpler.

>>Spirit is aimed to be a general parsing framework. It is not aimed to
>>be a simple command line parser.
>
> Hmm, most other posts argued otherwise.

No, many other posts argue that Spirit's flexibility makes it
a good candidate for command-line processing. Genericity does
not preclude application to a specialized task.

> When are you really going to *need* such freedom of syntax? Certainly not
> in parsing a command line

I think the current discussion has shown otherwise.

> internet. In all such cases, the rule is simplicity, because simplicity
> means usability

Simplicity means usability but that is not mutually exclusive with
flexibility.

> programmer in the third. Especially the third. Data over the internet has
> presumably already been parsed and for minimal bandwidth, should be encoded
> as compactly as possible, and not require reparsing on the other side.

Huh? Those requirements seem contradictory to me. Something encoded
for compactness is going to need something on the other end to decode
and expand it. gzip is an extreme example. Network protocol headers
are not particularly compact because routers, etc. need to process
them quickly. They have well-defined fields, etc. An analogy to
RISC/CISC instruction set architectures is appropriate.

> your program and information exchange simple, and the user will actually use
> the feature you make available on command line, config file, and programmer
> will be able to extend and debug the information being exchanged over the
> net.

I agree that at its basic level the interface to a CLA processor should
be relatively simple. But it also must be flexible and in my mind
such flexibility is easily managed with a parser generator.

> - Don't let the availability of a tool (solution) dictate the solution to a
> problem: keep the solution commensurate with the problem

Absolutely. But don't discount a tool due to unfamiliarity.

> - Plan for extendability of your application (not library) only so far;
> requirements will change, complexity will change, people with important
> knowledge will leave the project, understanding of the problem will augment
> dramatically, etc

Keep interfaces as simple as possible but allow for flexibility.
I think that's what you're saying here. Is that right? If so I
agree with you.

> - Keep the solution as simple as possible: avoid grammar if you don't need
> it, ie in command lines, config files, and internet data transfer;

The whole discussion right now is focusing on whether grammars are
appropriate for CLA processing. You can't just declare they are
not when many people disagree with you.

In my mind, the command-line and config file are one and the same.
Why make the user learn multiple syntaxes?

> - Use the proper tool for the problem: if you *don't need* grammar, then use a
> simple, straightforward, "grammar-less" parser; maybe that parser uses
> Spirit in the background, but the parser-user shouldn't have to care, and
> shouldn't even have to know anything about EBNF or regexp

In the simplest cases, I absolutely agree with you. I don't think
anyone has argued otherwise. Allowing parser extension may require
knowledge of Spirit, but as Joel pointed out, one can simply use
Spirit to define a CLA specification language. I'm not sure I'm too
hot on that idea because I usually don't like the idea of a
separate tool/config file to do something if it can be expressed in
the source language relatively easily. This is why I much prefer Spirit
to YACC.

> - I have nothing against Spirit, but I do against having the command-line
> parser *be* Spirit. Spirit is to command-line parsing what assembly is to
> C++: lower-level of abstraction since grammar is not needed.

What? It is precisely the other way around. Spirit provides the
abstraction (a grammar) to specify solutions in the problem domain
(parsing) rather than in the solution domain (C++). Note that I
am talking about implementation here. The interface to the
programmer may well be something different. Or it might be
Spirit. We need experimentation.
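
As a rough illustration of the problem-domain versus solution-domain
distinction, the sketch below puts a tiny command-line grammar in a comment
and a hand-written C++ parser for it underneath. None of this is Spirit
code, and the grammar, ParsedArgs and parse_command_line are all invented
for the example. The three grammar lines say what the input looks like; the
code beneath them is what one ends up maintaining without a tool that
accepts the grammar directly.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical grammar (EBNF), invented for this sketch:
//   command_line = { option } , { operand } ;
//   option       = "--" , name , [ "=" , value ] ;
//   operand      = any argument that does not start with "--" ;
struct ParsedArgs {
    std::map<std::string, std::string> options;  // name -> value ("" if none)
    std::vector<std::string> operands;
    bool ok;                                     // false on a syntax error
};

ParsedArgs parse_command_line(const std::vector<std::string>& args) {
    ParsedArgs result;
    result.ok = true;
    std::size_t i = 0;

    // { option }
    while (i < args.size() && args[i].compare(0, 2, "--") == 0) {
        const std::string body = args[i].substr(2);
        const std::string::size_type eq = body.find('=');
        const std::string name = body.substr(0, eq);
        if (name.empty()) {          // "--" or "--=value" has no name
            result.ok = false;
            return result;
        }
        result.options[name] =
            (eq == std::string::npos) ? std::string() : body.substr(eq + 1);
        ++i;
    }

    // { operand }
    for (; i < args.size(); ++i) {
        if (args[i].compare(0, 2, "--") == 0) {  // option after an operand
            result.ok = false;
            return result;
        }
        result.operands.push_back(args[i]);
    }
    return result;
}
```

Changing the accepted syntax, say to allow options after operands, is a
one-line change to the grammar but a restructuring of the loops. Whether a
parser generator makes that cheap enough in practice is exactly what the
experiments would have to show.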

> - A parser class is needed, but should not require knowledge of EBNF or
> regexp. And it is doable.

If you're talking about the interface, I agree with you up to a point.
Flexibility may require exposing some grammar-like thing. If you're
talking about implementation, I disagree. Why not make use of
something that automates a task, provided it is not too expensive
to do so (and again, we need experiments to determine this)?

                          -Dave

-- 
"Some little people have music in them, but Fats, he was all music,
  and you know how big he was."  --  James P. Johnson

Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk