Subject: Re: [boost] Scalpel: a Spirit&Wave-powered C++ source code analysis library
From: Florian Goujeon (florian.goujeon_at_[hidden])
Date: 2010-09-07 00:30:00


  Hi Doug,

I've been thinking about it for the past five days, and I must face
it: I'm the sole developer of Scalpel, while Clang is much more
advanced and is maintained by a whole community. I don't stand a chance.
Besides, when you say:

> New projects need critical mass to effectively
> compete with established projects, and without effective competition
> we don't see the benefits of diversity; we just see redundancy.

I finally have to admit you're totally right.
I thought the round-trip engineering/refactoring feature I planned to
develop would have made Scalpel unique, but it seems Clang's
librewrite already does it.
It's terribly hard to accept, but c'est la vie.

However, it would be a pity to throw all my work away.
I could try something. It's related to Dave's suggestion:

> One area that scalpel could conceivably find a niche, depending on how
> you do it, would be in analyzing source code without seeing the full
> translation unit (as you might for syntax-coloring purposes). Since
> CLANG is really built to be a compiler, I don't think it can do that.

Scalpel could actually do it.
While it's true that C++ is a "ridiculously ambiguous language", it
turns out that Scalpel's design has something special.
I wanted the syntax analyzer to be very loosely coupled with the
semantic analyzer. Consequently, the syntax analyzer is standalone.
The Spirit grammar doesn't run any semantic action.

*****

At this point, you may wonder how I planned to manage syntax
ambiguities.
There are two types of syntax ambiguity cases:
1) cases where there's always an interpretation which is more obvious
than the other one(s);
2) cases where you may reasonably ask the programmer to disambiguate
their code.
Whatever the case, the syntax analyzer (predictably) chooses one
of the interpretations.
Here are some examples:

The following line of code…:
a * b;
… may be either a multiplication or a pointer declaration.
The default interpretation is the pointer declaration. You can
reasonably ask the programmer to disambiguate the code by adding
parentheses if they want the syntax analyzer to interpret it as the
former:
(a * b);
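
To make the two readings concrete, here is a minimal sketch in plain C++. It is not taken from Scalpel: the namespaces, the Num type and the call counter are all made up for the illustration. The same tokens "a * b;" compile under both readings, depending only on what 'a' names in scope, which is exactly the information a standalone syntax analyzer doesn't have.

```cpp
#include <cassert>

// Reading 1: 'a' names a type, so "a * b;" declares 'b' as a pointer to 'a'.
namespace pointer_reading {
    struct a {};

    inline bool run() {
        a * b;            // declaration: b has type a*
        b = nullptr;      // only valid because b is a pointer variable
        return b == nullptr;
    }
}

// Reading 2: 'a' and 'b' name objects, so "a * b;" is an expression statement.
namespace product_reading {
    struct Num { int* calls; };   // hypothetical type, for illustration only

    // The overloaded operator counts its calls so the effect is observable.
    inline int operator*(Num lhs, Num) { return ++*lhs.calls; }

    inline int run() {
        int calls = 0;
        Num a{&calls}, b{&calls};
        a * b;            // expression: calls operator*(a, b), result discarded
        (a * b);          // parenthesized form: can only be an expression
        return calls;     // operator* was called twice
    }
}
```

Note that writing "(a * b);" inside pointer_reading::run() would not compile, since 'a' is a type there. This is why the parentheses are enough to force the expression reading.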

Trickier. The following line of code…:
a < b || c > d;
… may be either a boolean expression (a, b, c and d are variables of
type bool) or a variable declaration (whose name is 'd' and whose type
is a<b || c>, where 'a' is a class template taking one bool template
parameter and where 'b' and 'c' are both variables of type const
bool).
The default interpretation is the boolean expression. You can
reasonably ask the programmer to disambiguate the code by adding
parentheses if they want the syntax analyzer to interpret it as the
latter:
a < (b || c) > d;
Actually, I even wonder why the standard allows such ambiguities.
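
Again as a hedged sketch rather than Scalpel code (the namespaces and the 'flag' member are assumptions made for the example), both readings of "a < b || c > d" can be written out in plain C++. In the first namespace the tokens appear as an initializer, to keep the expression's value; in the second they form a complete declaration statement.

```cpp
#include <cassert>
#include <type_traits>

// Reading 1: a, b, c and d are bool variables, so the tokens form an expression.
namespace boolean_reading {
    inline bool run() {
        bool a = false, b = true, c = false, d = true;
        bool result = a < b || c > d;   // parsed as (a < b) || (c > d)
        return result;
    }
}

// Reading 2: 'a' is a class template taking one bool parameter,
// and 'b' and 'c' are constant bools, so the tokens form a declaration.
namespace template_reading {
    template <bool Flag>
    struct a { static constexpr bool flag = Flag; };

    constexpr bool b = false;
    constexpr bool c = true;

    inline bool run() {
        a < b || c > d;      // declaration: d has type a<(b || c)>, i.e. a<true>
        a < (b || c) > e;    // parenthesized form: same type, no ambiguity left
        static_assert(std::is_same<decltype(d), decltype(e)>::value,
                      "both spellings declare the same type");
        static_assert(std::is_same<decltype(d), a<true>>::value,
                      "the template argument is b || c");
        return d.flag && e.flag;
    }
}
```

A real compiler settles this by looking up 'a' first; a syntax analyzer without name lookup has to pick a default, as described above.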

Note: Scalpel successfully parses Apache's implementation of the C++
standard library.

*****

I could extract the syntax analyzer of Scalpel to create such a
library.
This would save one year of work out of two and make the syntax
analysis even more generic than it would have been by staying
encapsulated in Scalpel.

HOWEVER. I started the Scalpel project for two reasons:
1) I would have liked to develop/use a kind of UML tool with
round-trip engineering capabilities (i.e. able to generate a class
diagram from the source code and able to synchronize the source code
after a modification of that diagram) which would have used Scalpel.
2) I'm a 24-year-old software engineer, and completing such a complex
project could have been good for my early career.

Developing a syntax analysis library is far less impressive than
developing a full front-end. So, point 2 is out.
So is point 1, for obvious reasons.
It seems I have little incentive left to start this
project. UNLESS…

Of course, I like to code and I would be glad not to throw two whole
years of work away. Besides, just like Doug said: "I want to see
great, new ideas in C++ parsing and development tools", just for the
sake of C++.
But it would be even better if my career (and, secondarily, my
personal satisfaction) could still benefit from it.
This is why I need to know: is there a reasonable chance that such a
library will be accepted into Boost?
This would be a significant motive for me.

> I won't try to dissuade you further, because I've been in precisely
> the same position as you are now. Best of luck to you!

Thank you anyway ;).

>> If one day Scalpel is accepted into Boost, I'll release it under the
>> BSL without any hesitation.
>
> One note of caution: if you start getting contributions from others,
> you'll have to ask permission of each and every one of them when you
> want to switch licenses. Boost went through this when we switched over
> to the Boost Software License, and it's a real pain in the butt.
> Better to switch to the license you want now, or (barring that) get
> copyright assignment along with each contribution (as is done by the
> FSF) to ensure that you can easily switch later.

I planned to apply the latter for Scalpel. For the hypothetical new
library, I'll switch to the BSL right from the beginning.

