Boost logo

Boost :

From: Joel de Guzman (joel_at_[hidden])
Date: 2008-03-25 21:10:17


Eric Niebler wrote:
> Joel de Guzman wrote:

>> Eric and I had our share of squabbles in the past ;-) related
>> to how I expected Proto to be. I knew from the start that I
>> needed something like it. Joao Abecasis, a Spirit developer,
>> started one like it, from my urging. Fusion was developed,
>> in part, due to the need for manipulation of heterogeneous
>> data structures, specifically for spirit: syntax trees.
>
> But there still isn't a Fusion tree data structure, is there?

Nope. But I see no need. A fusion sequence can represent a
hierarchy because the elements are heterogeneous. A sequence
element can simply be another sequence.

>> I recall that our biggest disagreement, and one that we
>> resolved soon enough, was my need for flattened sequences.
>> Eric insisted on presenting the results with no flattening,
>> I wanted flattening. For example, I wanted:
>>
>> a >> b >> c >> d
>>
>> to be a flattened fusion sequence, not a tree. I was reluctant
>> to have Spirit go through a 2-step process that necessitated
>> a conversion from a tree to a fusion-sequence:
>>
>> proto-tree-->fusion-sequence
>
> All things come around again. With proto domains and generators, there
> is now a way to get the incremental flattening you originally wanted --
> within a domain, where it belongs. But as I was able to show, the
> current solution in Spirit-2 -- letting proto build the tree and
> flattening it all at once -- actually performs better.

Cool. Actually, in Spirit-2, not all nodes are flattened.
So that makes perfect sense; I'm already doing per node
flattening in the transforms. Perhaps I can take advantage
of this new feature to further simplify Spirit-2?

>> At first, this lead to the development of segmented fusion
>> iterators. Eric actually implemented this:
>> http://lafstern.org/matt/segmented.pdf into Fusion!
>> It was OK and efficient at runtime. Unfortunately, compile
>> time suffered considerably.
>
> To be clear, the compile time of a segmented Fusion algorithm with a
> segmented Fusion data structure is very good. It's when you use a
> non-segmented algorithm with a segmented data structure that the compile
> times get bad -- because you need to flatten the sequence first. Any
> attempt to use an ordinary Fusion algorithm with a hierarchical data
> structure will have this problem.
>
> The biggest problem with segmented fusion is the need to implement
> segmented variants of all the algorithms. That's a non-trivial undertaking.

Yes. And it's always better to deal with it on the sequence level,
(not the iterator level) as we found out. Indeed, that's the
fundamental idea behind SYB.

>> Our endless discussions culminated with:
>>
>> http://article.gmane.org/gmane.comp.parsers.spirit.devel/2765
>>
>> Seems some folks were tuning in to our discussions.
>> The C++ version of "Scrap Your Boiler Plate", a very
>> powerful, well studied and widely used design pattern
>> for generic traversal from the Haskell world:
>> http://portal.acm.org/citation.cfm?id=1159861.1159871
>> references Eric's post above.
>>
>> So far so good, yet, SYB ("Scrap Your Boiler Plate")
>> was not available in a mature form in C++ yet. This then,
>> again, lead to Dan Marsden's proposed traversal library.
>>
>> What Eric ended up doing was to provide a zero-overhead
>> tree structure that's converted very efficiently to a
>> flattened fusion sequence. It was as fast as my non-proto
>> Spirit2 implementation and with some compile time performance
>> hit.
>
> I would be surprised if the flatten-all-at-once approach causes longer
> compile times than the flatten-incrementally approach. I remember
> showing that, since flattening incrementally amounts to pushing at the
> back of a cons list, it causes more templates to be instantiated than
> just building the cons list all at once back-to-front when all the
> elements are known.
>
> If the proto-ified version of Spirit-2 has longer compile times than the
> non-proto-ified version, I suspect the problem is elsewhere.

Right. Perhaps I was under an obsolete impression :P
It might be a good idea to show some new benchmarks on
this? Alas, I no longer have the old non-proto code :(

>> In due time, I'll be using Dan traversal library.
>
> With Proto? Or as a replacement for Proto? How do you intend to use
> Dan's code in Spirit-2?

Proto+Fusion+SYB.

It seems Hartmut already outlined one plan. I've some
more ideas in addition to Hartmut's:

1 Proto+Spirit creates an ET tree from the Spirit grammar
   and rules. No need to flatten, Spirit stores the ET
   nodes as-is plus some node specific info (e.g. sequence,
   alternate tags).
2 Spirit+Fusion+Traversal traverses the ET tree using per-node
   specific specialized algorithms (e.g. parse/generate for
   terminals, sequences, alternates, etc) and generates
   attributes. Traversal deals with the walking strategy (e.g.
   I want to have a full stride over all sequence elements
   as if it was flattened).
3 Attributes are fusion-sequences, stl-sequences and variants
   essentially representing the AST. This AST can again be
   traversed/transformed by Traversal (e.g. for subsequent passes
   over the AST like optimizations and actual backend code
   generation).

If this is a bit unclear at the moment, no worries. The ideas
are just forming :-)

Regards,

-- 
Joel de Guzman
http://www.boost-consulting.com
http://spirit.sf.net

Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk