From: Pedro Lamarão (pedro.lamarao_at_[hidden])
Date: 2005-06-15 12:45:43

Scott Woods wrote:

> 1. The difference (in terms of CPU time) in maintaining a counter
> and inspecting a "current byte" and testing it for "end of message"
> seems minimal. This is stated relatively, i.e. it is far more significant
> that the bytes sent across the network are being scanned at the
> receiver more than once. Even maintaining the body counter is
> a (very low cost...) scan.
> 2. An approach using lex+parse techniques accepts raw byte
> blocks as input (convenient) and notifies the user through some
> kind of accept/reduce return code, that the message is complete
> and already "broken apart", i.e. no further scanning required
> by higher layers.
> 3. Lex+parse techniques do not care about block lengths. An
> accept state or parser reduction can occur anywhere. All the
> "unget" contortions recently mentioned are not needed. Partial
> messages are retained in the parser stack and only finally
> assembled on accept/reduce. This property is something
> much easier to live with than any kind of "fixed-size" approach
> that I have dealt with so far.

This is the kind of application of a network library I'm most intrigued by.

I've experimented with an approximation of this approach: in a C#
application, I replaced a sinister buffering scheme with apparently
inefficient calls to the equivalents of send and receive, fetching only
one byte at a time, and implemented a simple lexer on top. I expected
terrible losses but experienced very few. Later, reapplying a buffering
layer at only two particular points made the difference very difficult
to measure.
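
To make the idea concrete, here is a minimal sketch of such a lexer in C++. It assumes, purely for illustration, a protocol whose messages end in CRLF; the names line_lexer, feed and pending are hypothetical and not part of any library discussed in this thread. The point is only that input blocks of any length are accepted and partial messages are retained until the terminator arrives.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical incremental lexer for CRLF-terminated messages.
// It does not care where block boundaries fall.
class line_lexer {
public:
    // Feed a raw block of any length; returns the messages it completed.
    std::vector<std::string> feed(const char* data, std::size_t size) {
        assert(data != nullptr || size == 0);
        std::vector<std::string> complete;
        for (std::size_t i = 0; i < size; ++i) {
            const char c = data[i];
            if (c == '\n' && !partial_.empty() && partial_.back() == '\r') {
                partial_.pop_back();           // drop the '\r'
                complete.push_back(partial_);  // "accept": message is whole
                partial_.clear();
            } else {
                partial_.push_back(c);         // retain the partial message
            }
        }
        return complete;
    }

    // Bytes received so far that do not yet form a complete message.
    const std::string& pending() const { return partial_; }

private:
    std::string partial_;
};
```

Feeding "PING :x\r" and then "\nNICK fo" yields nothing from the first call and one message from the second, with "NICK fo" left pending. No "unget" is needed at any point.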

>>First. We, unfortunately, can't pass std::vector to the operating
>>system, so, at some point, we are allocating fixed sized buffers, and
>>passing it to our IO primitives. There is no escape.
> Errrrr. Not quite following that. Are you saying that
> send( socket_descriptor, &vector_buffer[ 0 ], vector_buffer.size() )
> is bad?

No. What I meant was, the operating system won't resize std::vector for
you. It expects a fixed-size amount of memory.
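
A small sketch of what I mean, using a stand-in for the receive primitive so it can run without a socket. The names fake_receive and receive_into, and the chunk size, are assumptions made up for this example. Note that it is our code, not the primitive, that grows and trims the vector.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Stand-in for a primitive such as recv(): writes at most `capacity`
// bytes into `out` and returns how many it wrote. Hypothetical.
std::size_t fake_receive(const char* wire, std::size_t wire_len,
                         char* out, std::size_t capacity) {
    const std::size_t n = std::min(wire_len, capacity);
    std::memcpy(out, wire, n);
    return n;
}

// Appends the result of one receive call to `buffer`.
std::size_t receive_into(std::vector<char>& buffer,
                         const char* wire, std::size_t wire_len,
                         std::size_t chunk = 4) {
    assert(chunk > 0);
    const std::size_t old_size = buffer.size();
    buffer.resize(old_size + chunk);   // we make room: a fixed-size window
    const std::size_t n =
        fake_receive(wire, wire_len, buffer.data() + old_size, chunk);
    buffer.resize(old_size + n);       // we trim to what actually arrived
    return n;
}
```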

Because of this, every "dynamically safe buffering" must be a layer over
a "fixed size error-prone" buffering done somewhere. That is a
constraint of our primitives.

The intention of a streambuf implementation is precisely to conceal such
a fixed size buffering, offering the most generic interface to what now
becomes a concealed "sequence" (as the documentation I have at hand
would call it).
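
As a rough illustration, the following input-only streambuf hides an eight-byte buffer behind the usual interface. This is a sketch under stated assumptions, not the netbuf from my package: fake_source stands in for a socket, and both class names are invented for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <istream>
#include <streambuf>
#include <string>
#include <utility>

// Stand-in for a socket: hands out its data in pieces of at most
// `size` bytes and returns 0 once exhausted. Hypothetical.
class fake_source {
public:
    explicit fake_source(std::string data) : data_(std::move(data)), pos_(0) {}
    std::size_t receive(char* out, std::size_t size) {
        assert(out != nullptr);
        const std::size_t n = std::min(size, data_.size() - pos_);
        std::memcpy(out, data_.data() + pos_, n);
        pos_ += n;
        return n;
    }
private:
    std::string data_;
    std::size_t pos_;
};

// Input-only streambuf concealing a fixed-size buffer.
class fixed_inbuf : public std::streambuf {
public:
    explicit fixed_inbuf(fake_source& src) : src_(src) {
        setg(buf_, buf_, buf_);  // empty get area; first read calls underflow
    }
protected:
    int_type underflow() override {
        if (gptr() < egptr())
            return traits_type::to_int_type(*gptr());
        const std::size_t n = src_.receive(buf_, sizeof buf_);
        if (n == 0)
            return traits_type::eof();
        setg(buf_, buf_, buf_ + n);  // expose the freshly filled bytes
        return traits_type::to_int_type(*gptr());
    }
private:
    fake_source& src_;
    char buf_[8];  // deliberately tiny to force several refills
};
```

A std::istream constructed over a fixed_inbuf can then extract words or lines far longer than eight bytes; the caller never sees the buffer boundary.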

> Yes you make some very good points. The product I am currently working
> on is a vipers' nest of the protocols you talk about and more. There have
> been some unpleasant suggested uses for protocols such as IMAP4. Trying
> to build a generic network messaging library that facilitates clear concise
> application protocols *and* can cope with the likes of IMAP4 is, I believe,
> unrealistic.

The skeleton of a "protocol message" as I've been working on it is more
or less:


class protocol_message {
public:
    void clear () { /* Clear data members. */ }

    template <typename IteratorT>
    boost::spirit::parse_info<IteratorT>
    parse (IteratorT begin, IteratorT end);
    // Defined later.
};

template <typename CharT, typename TraitsT>
std::basic_ostream<CharT, TraitsT>&
operator<< (std::basic_ostream<CharT, TraitsT>& o,
            protocol_message const& m);
// However a message is represented on the net...

template <typename CharT, typename TraitsT>
std::basic_istream<CharT, TraitsT>&
operator>> (std::basic_istream<CharT, TraitsT>& i, protocol_message& m) {
  using namespace boost::spirit;

  // Here we use the Magic Glue
  typedef multi_pass<std::istreambuf_iterator<char> > iterator_t;
  iterator_t begin(i);
  iterator_t end = make_multi_pass(std::istreambuf_iterator<char>());

  parse_info<iterator_t> info = m.parse(begin, end);
  if (!info.hit)
    i.setstate(std::ios_base::failbit);  // Unsuccessful parse.

  return i;
}

namespace detail {
  class grammar : public boost::spirit::grammar<grammar> {
  public:
    grammar (protocol_message& m) : _M_m(m) {}

    template <typename ScannerT>
    class definition;
    // We'll write through _M_m, but the definition constructor takes a
    // const reference. A reference member stays writable through a
    // const grammar, so no "mutable" is needed (nor allowed).

    protocol_message& _M_m;
  };
}

template <typename IteratorT>
boost::spirit::parse_info<IteratorT>
protocol_message::parse (IteratorT begin, IteratorT end) {
  detail::grammar g(*this);
  return boost::spirit::parse(begin, end, g);
}

Note how operator>> sets failbit in case of an unsuccessful parse: it
allows us to write:

iostream stream;
protocol_message message;

while (stream >> message) {
  // Work.
}
// Parsing failed or other error; try to recover?

No exception is thrown by default. One could be, though: the stream can
be configured through exceptions() to throw an ios_base::failure
whenever failbit is set.
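
Both behaviours can be tried without Spirit or a network. The sketch below uses a made-up message type, kv_message, holding one "KEY=VALUE" line; it and count_messages are assumptions for illustration only and stand in for a real protocol message and its grammar.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical message type: a single "KEY=VALUE" line.
struct kv_message {
    std::string key;
    std::string value;
};

std::istream& operator>>(std::istream& in, kv_message& m) {
    std::string line;
    if (!std::getline(in, line))
        return in;  // end of input or earlier error; state is already set
    const std::string::size_type eq = line.find('=');
    if (eq == std::string::npos || eq == 0) {
        in.setstate(std::ios_base::failbit);  // unsuccessful parse
        return in;
    }
    m.key = line.substr(0, eq);
    m.value = line.substr(eq + 1);
    return in;
}

// Reads messages until the first failure and returns how many parsed.
int count_messages(std::istream& in) {
    int n = 0;
    kv_message m;
    while (in >> m)
        ++n;
    return n;
}
```

With the default settings, a malformed line simply ends the loop and leaves failbit set. After calling in.exceptions(std::ios_base::failbit), the same setstate call throws instead, so the caller chooses the error style without the message class changing.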

The current implementation of the irc_client example distributed in the
package I uploaded to the Sandbox is in this URI:

This version has a Spirit grammar for a (modified) version of the IRC
grammar as defined in RFC 2812. It's still rough around the edges, but
much better than it used to be.

IRC is a very uninteresting application, but it's an interesting
protocol to experiment with, as there is no guarantee of when a message
will arrive or where it will come from. "Synchronized" protocols like
SMTP are much easier: client sends, server responds, and that's pretty
much it.

I'm very interested in these kinds of applications of a "netbuf" and in
the implementation of reusable "protocol message" classes for common
protocols; I'm probably going after HTTP next, and will try to write a
simplified wget.

There was also a concern earlier in this thread about excessive
buffering in streambufs with "fixed-sized message" protocols, which I'd
like to address with an example.

Pedro Lamarão
Intersix Technologies S.A.
SP: (55 11 3803-9300)
RJ: (55 21 3852-3240)
Your Security is our Business
