[Boost-bugs] [Boost C++ Libraries] #1498: xml parser: iteration instead of recursion in 'content' rule

Subject: [Boost-bugs] [Boost C++ Libraries] #1498: xml parser: iteration instead of recursion in 'content' rule
From: Boost C++ Libraries (noreply_at_[hidden])
Date: 2007-12-03 16:31:47


#1498: xml parser: iteration instead of recursion in 'content' rule
-------------------------------------+--------------------------------------
 Reporter: j.swoboda_at_[hidden] | Owner: ramey
     Type: Patches | Status: new
Milestone: | Component: serialization
  Version: Boost 1.34.1 | Severity: Problem
 Keywords: |
-------------------------------------+--------------------------------------
 The 'content' rule in basic_xml_grammar.ipp contains a recursion, which
 leads to stack overflows if the serialized data contains many escaped
 characters.

 The end user of our application may serialize arbitrary binary data, and
 the MSVC linker limits the stack size to 1 MB by default. Deserialization
 of a std::string containing the Verdana.ttf font file that comes with
 every Windows installation fails because it requires about 2 MB of stack
 space. While it is possible to increase the stack size of the executable,
 that still would not guarantee the deserialization of arbitrary data.

 Proposed change:
 The 'content' rule matches the delimiter '<' or a sequence of one or more
 'Reference' or 'CharData' rules followed by the delimiter '<'. This
 requires that neither 'Reference' nor 'CharData' match an empty string,
 so the 'CharDataChars' rule uses the Positive operator (+) instead of the
 Kleene star (*). The use of 'CharData' in the rule 'UnusedAttribute' has to
 be adapted by prefixing 'CharData' with the Optional operator (!).
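
 The difference between the two shapes of the rule can be sketched without
 Spirit. The following is a hand-written analogue, not the code from
 basic_xml_grammar.ipp: the helper names (match_char_data, match_reference,
 make_input), the simplified "&...;" reference syntax, and the explicit depth
 counter are assumptions made for illustration only.

```cpp
// Hand-written analogue of the 'content' rule; NOT the Boost.Spirit code.
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: length of a run of characters other than '&' and '<'.
// Returns 0 when nothing matches, like a rule built on the Positive operator.
std::size_t match_char_data(const std::string& s, std::size_t pos) {
    std::size_t n = 0;
    while (pos + n < s.size() && s[pos + n] != '&' && s[pos + n] != '<') ++n;
    return n;
}

// Hypothetical helper: length of a simplified "&...;" reference, or 0.
std::size_t match_reference(const std::string& s, std::size_t pos) {
    if (pos >= s.size() || s[pos] != '&') return 0;
    std::size_t end = s.find(';', pos);
    return end == std::string::npos ? 0 : end - pos + 1;
}

// Old shape: content = '<' | (Reference | CharData) >> content.
// Every token adds one nested call; 'depth' records the deepest level seen.
// (A compiler may optimize this simple tail call; the counter only makes
// the nesting of the grammar visible.)
bool content_recursive(const std::string& s, std::size_t pos,
                       std::size_t level, std::size_t& depth) {
    assert(pos <= s.size());
    if (level > depth) depth = level;
    if (pos < s.size() && s[pos] == '<') return true;
    std::size_t n = match_reference(s, pos);
    if (n == 0) n = match_char_data(s, pos);
    if (n == 0) return false;  // nothing consumed: fail instead of looping
    return content_recursive(s, pos + n, level + 1, depth);
}

// Proposed shape: content = '<' | +(Reference | CharData) >> '<'.
// The tokens are consumed in a loop, so the call depth does not grow.
bool content_iterative(const std::string& s, std::size_t pos,
                       std::size_t& tokens) {
    tokens = 0;
    for (;;) {
        if (pos < s.size() && s[pos] == '<') return true;
        std::size_t n = match_reference(s, pos);
        if (n == 0) n = match_char_data(s, pos);
        if (n == 0) return false;  // both alternatives must consume input
        pos += n;
        ++tokens;
    }
}

// Hypothetical test input: 'count' escaped characters followed by '<'.
std::string make_input(std::size_t count) {
    std::string s;
    for (std::size_t i = 0; i < count; ++i) s += "a&amp;";
    s += '<';
    return s;
}
```

 In this sketch, make_input(N) yields 2N tokens. The recursive form nests
 2N + 1 calls deep, while the iterative form handles the same input inside
 a single call. The loop only terminates safely because neither alternative
 can match an empty string.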

 Effect:
 Only deserialization is affected; serialized files are identical. The
 'content' rule iterates over the data instead of recursing into itself,
 which requires less than 128 KB of stack space for the example file
 mentioned above.

 A diff of basic_xml_grammar.ipp follows.

 {{{
 272c272
 < CharDataChars = *(anychar_p - chset_p(L"&<"));
 ---
 > CharDataChars = +(anychar_p - chset_p(L"&<"));
 308c308
 < | (Reference | CharData) >> content
 ---
 > | +(Reference | CharData) >> L"<"
 371c371
 < >> CharData
 ---
 > >> !CharData
 }}}

--
Ticket URL: <http://svn.boost.org/trac/boost/ticket/1498>
Boost C++ Libraries <http://www.boost.org/>
Boost provides free peer-reviewed portable C++ source libraries.

