Subject: Re: [boost] [1.45][website] binary serialization fubar, website needs updating (was: Beta next week?)
From: Bryce Lelbach (admin_at_[hidden])
Date: 2010-10-18 08:03:06


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Mon, 18 Oct 2010 11:19:01 +0400
Vladimir Prus <vladimir_at_[hidden]> wrote:

> Bryce Lelbach wrote:
>
> > -----BEGIN PGP SIGNED MESSAGE-----
> > Hash: SHA1
> >
> > Ramey, just to make you aware:
> >
> > The new XML grammar adds another fatal bug to serialization when building
> > with mingw (there's already a set of issues with mingw) - mingw causes a
> > win32 exception (a stack overflow) when compiling the Qi grammar. I did my
> > best to try and come up with a fix for both this and the existing mingw
> > issues, but I've had no luck. Documentation on debugging mingw is lacking,
> > so I doubt I am going to be able to get serialization to compile with mingw
>
> This sounds really disturbing. XML is generally not rocket science, and
> serialization does not use very much of XML even, so if gcc (even if running
> on windows) is unable to compile an XML parser, maybe something is wrong with
> the parser?

Volodya,

The fact is, while XML might not be rocket science to the end user, the
implementation of a proper XML parser -is- rocket science. Serialization does
not feature anything near a full, validating XML parser. However, consider
this.

 * An XML parser that checks well-formedness only (i.e. excluding semantic
   analysis) must implement 84 production rules (5 are deprecated).
 * The average XML production rule is an expression with about 4 (4.2145, to
   be exact) terminals or non-terminals. To give you an idea of what an expression
   template of that size looks like to the compiler, you might want to take a
   look at http://tinyurl.com/37s6qqp.
 * When I talk about the new XML grammar, I am not referring to a newly designed
   grammar. While my initial instinct was to rewrite Ramey's grammar entirely,
   in the end I decided to simply port the grammar to Qi. With only a few minor
   exceptions, the expression templates in the new XML grammar are almost
   identical to the expression templates in the old Spirit Classic grammar.
 * Existing C++ compilers were not designed to handle C++ TMP gracefully. To
   elegantly compile code built with expression templates (e.g. code built on
   top of the Proto library), one would need a C++ compiler that has been
   designed with specific consideration given to advanced C++ programming
   techniques such as template metaprogramming. Until I write that (check back
   in with me in a few months), the best we have is GCC or clang.
 * GCC does not have any problems with this code on Linux whatsoever; I have
   compiled Spirit projects with GCC in the past that pushed the compiler to
   its limits. This is not one of them. Relatively speaking, this grammar is
   pretty lightweight.
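
To make the point about expression template size concrete, here is a minimal
sketch of how operator overloading turns a grammar expression into a nested
type. It is not Spirit's or Proto's actual implementation: the namespace
`sketch` and the names `terminal`, `sequence`, `alternative` and `node_count`
are all made up for illustration, and the code assumes a C++11 compiler.

```cpp
#include <cstddef>
#include <type_traits>

namespace sketch {  // hypothetical names throughout; not Spirit or Proto types

struct parser_tag {};                 // marks the toy parser types below

struct terminal : parser_tag {};      // stands in for a literal or a rule reference

template <class L, class R> struct sequence    : parser_tag {};  // a >> b
template <class L, class R> struct alternative : parser_tag {};  // a | b

template <class T>
struct is_parser : std::is_base_of<parser_tag, T> {};

// The operators compute nothing at run time; each one only builds a new type
// that records the shape of the expression.
template <class L, class R>
typename std::enable_if<is_parser<L>::value && is_parser<R>::value,
                        sequence<L, R> >::type
operator>>(L, R) { return sequence<L, R>(); }

template <class L, class R>
typename std::enable_if<is_parser<L>::value && is_parser<R>::value,
                        alternative<L, R> >::type
operator|(L, R) { return alternative<L, R>(); }

// Count the nodes in an expression type: a leaf is one node, and a binary
// node is one plus the nodes of both children.
template <class T>
struct node_count : std::integral_constant<std::size_t, 1> {};

template <class L, class R>
struct node_count<sequence<L, R> >
    : std::integral_constant<std::size_t,
                             1 + node_count<L>::value + node_count<R>::value> {};

template <class L, class R>
struct node_count<alternative<L, R> >
    : std::integral_constant<std::size_t,
                             1 + node_count<L>::value + node_count<R>::value> {};

// A toy production with four terminals, roughly the average size quoted above.
typedef decltype(terminal() >> terminal() >> (terminal() | terminal())) rule_type;

// Four leaves already produce a type with seven nodes.
static_assert(node_count<rule_type>::value == 7,
              "four terminals joined by three binary operators");

}  // namespace sketch
```

Here `rule_type` is `sequence<sequence<terminal, terminal>,
alternative<terminal, terminal> >`. A real grammar attaches attributes,
actions and skippers to every node, so the types the compiler has to
instantiate are far larger than in this toy.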

Despite all of the above, you are correct. The parser is at fault, because the
parser is the code that I have the ability to manipulate. In an ideal world, I'd
be able to fix the underlying problem with mingw, get mingw to accept my patch,
and get mingw serialization users to update their mingw to trunk. However, I am
content to try and fix things by modifying my grammar.

Unfortunately, the problem arises from an ICE (internal compiler error), and I
have been unable to find any reliable instructions on how to build mingw with
debugging symbols (the mingw configure script from mingw's svn trunk will not
work on either Windows or Linux; the mingw website is outdated and information
is sparse).

So, at this point I would be taking shots in the dark. Usually, if I have a
Spirit grammar that causes a GCC ICE on Linux, I can run GCC under GDB and
locate the specific part of my code that it is choking on (I used to do some
work on GCC, so I have a decent understanding of its internals, enough to allow
me to extract the information I need in GDB).

My hope is that after I've spent some time investigating the ICE on Intel's
compiler, I'll have a better idea of where in the grammar the problem is
happening. Using my template instantiation profiler and GCC's
-fdump-class-hierarchy/-fdump-tree-all options, I can easily locate the
template metaprograms in the grammar that have the greatest time complexity
and memory costs. However, given that GCC on Linux can compile this code without
significant strain on system resources, and given that mingw is causing a win32
stack overflow exception, the location of this issue might not have any
correlation to the most intensive metaprograms in the grammar.

If mingw documentation were less sparse, my motivation to fix this would be
much greater. I'll certainly continue to work on this (if I don't make any
progress I'll contact the mingw mailing list), but my expectation is that this
will not be easy to pin down.

 - Bryce

- --
Bryce Lelbach aka wash
http://groups.google.com/group/ariel_devel
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)

iEYEARECAAYFAky8N3oACgkQO/fqqIuE2t7iqQCgsHaOCYFfMENJQ9iAv9QjLYx2
F6oAnRFz+r6BLS69WNWJ5p8S47Fw25KH
=s3NZ
-----END PGP SIGNATURE-----


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk