I've seen Boost.Build support for ANTLR (the parser generator) discussed, but I haven't seen a generic implementation. It seems to me that the hardest part is generating the non-trivial grammar file dependencies and output files. I say non-trivial because, as of ANTLR 3.1, you can split grammars across multiple files, importing definitions between them using the "import" directive. (This splitting is often necessary for large grammars, which would otherwise generate parsers too large for some tools, most notably the Java "code too large" issue.)

For example, suppose I have a lexer grammar defined in Lexer.g, a base parser grammar in Base.g, and a derived composite grammar in Derived.g, which contains the directive "import Lexer, Base;". Running ANTLR on Derived.g with the -depend option generates the following output:

Derived.g: Lexer.g, Base.g
DerivedParser.c : Derived.g
.\Derived.tokens : Derived.g
DerivedParser.h : Derived.g
DerivedLexer.c : Derived.g
DerivedLexer.h : Derived.g
Derived_Lexer.c : Derived.g
Derived_Base.c : Derived.g
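
To make the format concrete, here is a minimal Python sketch of how output like the above could be parsed into a target-to-prerequisites mapping. The function name parse_depend_output is hypothetical, and the sketch assumes every line has the form "target : dep, dep" as in the listing; it is not part of ANTLR or Boost.Build.

```python
import re

# Split "target : dep, dep" at the first colon that is followed by whitespace
# (or ends the line), so a drive-letter colon such as "C:\out" is not treated
# as the separator.
_RULE = re.compile(r"^(?P<target>.+?)\s*:(?:\s+|$)(?P<deps>.*)$")


def parse_depend_output(text):
    """Return a dict mapping each target to its list of prerequisites."""
    rules = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = _RULE.match(line)
        if match is None:
            continue  # not a dependency line; skip it
        # Prerequisites are comma-separated in the listing above.
        deps = [d.strip() for d in match.group("deps").split(",") if d.strip()]
        rules.setdefault(match.group("target"), []).extend(deps)
    return rules
```

Calling parse_depend_output("Derived.g: Lexer.g, Base.g") returns {"Derived.g": ["Lexer.g", "Base.g"]}. Splitting on a colon followed by whitespace, rather than on any colon, is a guess at what is needed to cope with Windows paths in the output.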

It seems like the most future-proof way to handle these dependencies is to invoke ANTLR to produce them (as above). Since ANTLR is a Java program with not insignificant startup time, writing these dependencies to a static dependency file (which in turn depends on the source file) seems like the best approach.
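
The caching idea can be sketched outside of Boost.Build as follows. This is only an illustration in Python, under the assumption that a timestamp comparison between the grammar and the dependency file is enough; refresh_depfile and the run_antlr_depend callable are hypothetical names, and how ANTLR is actually launched is left to the caller.

```python
import os


def depfile_is_stale(grammar_path, depfile_path):
    """True if the dependency file is missing or older than the grammar."""
    if not os.path.exists(depfile_path):
        return True
    return os.path.getmtime(depfile_path) < os.path.getmtime(grammar_path)


def refresh_depfile(grammar_path, depfile_path, run_antlr_depend):
    """Rewrite the dependency file only when it is stale.

    run_antlr_depend is a hypothetical callable supplied by the caller; it
    takes the grammar path and returns ANTLR's -depend output as a string.
    Returns True if the callable was invoked, False if the cached file was
    reused.
    """
    if not depfile_is_stale(grammar_path, depfile_path):
        return False
    text = run_antlr_depend(grammar_path)
    with open(depfile_path, "w") as out:
        out.write(text)
    return True
```

The point of the design is that the slow Java startup is paid only when the grammar has changed since the dependency file was last written; otherwise the cached file is read directly.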

So my question: Since this dependency output looks almost like a makefile (except for the commas), and other tools produce similar output (e.g. gcc -M), are there any existing tool scripts or generators that I could leverage or learn from? Can the built-in scanner support be used for this, given that a) the file containing the dependencies is not itself the dependent file and b) there can be an arbitrary number of dependencies per line?
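
To show how small the gap to makefile syntax is, here is a rough Python sketch that rewrites each line with space-separated prerequisites. The name to_make_rules is hypothetical, and the sketch assumes the only differences are the commas and the spacing around the colon; it does not escape spaces or other special characters in file names.

```python
import re


def to_make_rules(depend_text):
    """Convert 'target : a, b' lines into make-style 'target: a b' lines."""
    rules = []
    for line in depend_text.splitlines():
        # Split once, at the first colon followed by whitespace or end of line.
        parts = re.split(r"\s*:(?:\s+|$)", line.strip(), maxsplit=1)
        if len(parts) != 2:
            continue  # blank or unrecognized line
        target, deps = parts
        # Commas become spaces; split() then collapses repeated whitespace.
        prerequisites = " ".join(deps.replace(",", " ").split())
        rules.append((target + ": " + prerequisites).rstrip())
    return "\n".join(rules)
```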

I've used Boost.Build for a while, but I've never tried to extend it. I've read through the Extender Manual, but it seems like pretty nuanced stuff, so any guidance would be appreciated. For instance, I have no clue how to read the contents of a file, even after studying scanner.jam.

Thanks,
Trevor