
From: Andrew Sutton (asutton_at_[hidden])
Date: 2007-09-02 08:14:07


> I carefully said it was "my definition" because I'm aware there are
> multiple different interpretations. Why do you consider your
> definition to be the 'right' one? :)

The definition that commonly occurs in the refactoring literature:
preserving external behavior (or was it exterior, or outward facing? I
don't remember exactly). It amounts to any transformation that
preserves the successful execution of a test program.
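To make that concrete, here's a minimal sketch (the function name and
the test are made up) of what "preserves the successful execution of a
test program" means for something like a rename:

#include <cassert>

// Hypothetical example: suppose the function used to be called `calc`.
// Renaming it to `area` (and updating every call site) counts as a
// refactoring under this definition because the test program below
// still runs successfully.
int area(int w, int h) { return w * h; }

int main() {
    // The "test program": if this assertion still holds after the
    // transformation, external behavior has been preserved.
    assert(area(3, 4) == 12);
    return 0;
}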

>> I would also
>> point out that in order to ensure that a program is correct it first
>> has to be preprocessed, and parsed. Something as simple as renaming a
>> function - which is a well-known refactoring - requires none of that.
>
> Actually, you're wrong. C and C++ require token pasting and escaped
> newline splicing to happen - and they certainly can occur in
> identifiers. That is, unless you're willing to break some correct
> code, various hacks like using sed can sometimes work... be careful
> of scoping issues though, particularly when macros can expand into
> {'s :)

Yeah, but you're still talking about transformations on structured
text. Having a lexically correct program is a pretty far cry from
having an actually correct program - which I agree is important. You
may also have cases where you want to operate directly on macros
without expansion - or on header inclusions without inclusion.
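For example, all of the problems quoted above show up in a contrived
fragment like this (names invented for illustration): line splicing
and token pasting mean the identifier you want to rename never has to
appear as contiguous text, and a macro can hide the '{' that a purely
textual tool would count for scoping.

#include <cstdio>

// Phase-2 line splicing: the identifier `counter` is split across two
// physical lines, so a line-oriented sed rename misses this definition.
int coun\
ter = 0;

// Token pasting: `reset_counter` never appears literally in the source.
// To rename it you have to edit the macro itself, not any expansion.
#define MAKE_RESET(name) void reset_##name() { name = 0; }
MAKE_RESET(counter)

// A macro expanding into '{': brace matching on the raw text gets the
// scopes wrong.
#define BEGIN_BLOCK {
void f() BEGIN_BLOCK
    reset_counter();
    std::printf("%d\n", counter);
}

int main() { f(); return 0; }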

>> The complexity of the refactoring determines the amount of
>> information needed - whether or not you actually need a fully correct
>> AST all the time - I doubt it.
>
> Certainly, it obviously depends on the transformation.

The software engineering research literature is actually a good place
to go to see how people are trying to deal with these problems (that
is, if you can still find people working with C++ - most researchers
prefer Java these days). One of the conclusions from all this work is
that there's a distinct difference between a compiler and what
sometimes gets called a reverse engineering parser. The tradeoff is
between correctness and robustness: the latter gives up some precision
in exchange for the ability to work with more code and under more
conditions - like in an editor, or in the absence of a correct build.

I guess I'm trying to say that there is a broad class of operations on
source code that requires a holistic view of the text rather than a
fully preprocessed and lexed view - plus the ability to build a
partial AST on top of it. I think it will be interesting to see
whether llvm/clang is capable of supporting both approaches to source
code analysis.

Andrew Sutton
asutton_at_[hidden]

