From: Maxim Yegorushkin (e-maxim_at_[hidden])
Date: 2005-06-14 02:30:56


On Tue, 14 Jun 2005 08:00:26 +0400, Scott Woods <scottw_at_[hidden]> wrote:

[]

> 3. Lastly, some of the observations (while excellent) seem a bit "macro"
> when a wider view might lead to different thinking. What I am trying to
> suggest here is that the time spent in vector<>::resize is truly
> surprising

It was for me, too.

> but it's also very low-level. Having been through 4 recent refactorings
> of a network framework, I have been surprised at the gains made in other
> areas by conceding say, byte-level processing, in another area.

I'm currently working on a network framework. The three major performance
improvements over several iterations were: a) dropping textuality; b)
dropping a C++ glue layer that was built over libevent, so I'm now using
libevent directly - this _is_ the framework for me; c) dropping
std::vector as a message buffer.
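
As an illustration of point c): vector<char>::resize value-initializes
(zero-fills) every element it adds, which is wasted work when a socket
read is about to overwrite those bytes. The sketch below is one possible
replacement, not the buffer actually used in the framework; the class
RawBuffer and all of its members are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical fixed-capacity receive buffer (illustrative names only).
// new char[n] leaves the bytes uninitialized, so no zero-fill is paid.
class RawBuffer {
public:
    explicit RawBuffer(std::size_t capacity)
        : data_(new char[capacity]), capacity_(capacity), size_(0) {}

    // Where the next read may store bytes, and how many fit.
    char* write_ptr() { return data_.get() + size_; }
    std::size_t writable() const { return capacity_ - size_; }

    // Record that n bytes were stored at write_ptr() by a read call.
    void commit(std::size_t n) {
        assert(n <= writable());
        size_ += n;
    }

    const char* data() const { return data_.get(); }
    std::size_t size() const { return size_; }
    void clear() { size_ = 0; }

private:
    std::unique_ptr<char[]> data_;
    std::size_t capacity_;
    std::size_t size_;
};
```

A read loop would pass write_ptr() and writable() to the read call and
then commit() the number of bytes actually received.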

> To make more of a case around the last point, consider the packetizing,
> parsing and copying that's recently been discussed. This has been related
> to the successful recognition of a message on the stream.
>
> Is it acknowledged that a message is an arbitrarily complex data
> object?

It is.

> By the time an application is making use of the "fields" within
> a message that's probably a reasonable assumption. So at some point
> these fields must be "broken out" of the message.

A point to note here is that there may be checkpoints on a message path
where a message must be read in order to be forwarded. At such points one
wants to avoid parsing the whole message.

> Or parsed. Or marshalled. Or serialized. Is the low-level packet (with
> the length header and body) being scanned again? What copying is being
> done? This seems like multi-pass to me.

> To get to the point; I am currently reading blocks off network
> connections and presenting them to byte-by-byte lexer/parser routines.
> These form the structured network messages directly, i.e. fields are
> already plucked out.
>
> So which is better? Direct byte-by-byte conversion to structured network
> message or multi-pass?

I'm not sure I understand "byte-by-byte conversion" and "multi-pass".

What I did was break a message into two parts: header and body. The
header contains the message type and the asynchronous completion token
stack. The body contains application-protocol-specific data. A message is
read into a chunk of memory (which was that vector<char>) and only the
header part is parsed. When a message is forwarded, only the header part
is rebuilt; the body gets forwarded without any user-space copying. Only
at the final destination does an application parse the message body.
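
A minimal sketch of that header/body split follows. The wire layout (a
1-byte type, a 1-byte token count, then 4-byte big-endian tokens) and the
names Header, BodyView, parse_header and build_header are assumptions
made up for illustration; the real format is not described here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical wire layout (an assumption, not the real format):
//   [type: 1 byte][act_count: 1 byte][act_count * 4 bytes, big-endian][body]
struct Header {
    std::uint8_t type = 0;
    std::vector<std::uint32_t> act_stack;  // asynchronous completion tokens
};

// Non-owning view of the body bytes inside the receive buffer.
struct BodyView {
    const char* data;
    std::size_t size;
};

// Parse only the header; the body is located but never read or copied.
inline BodyView parse_header(const char* buf, std::size_t len, Header& out) {
    if (len < 2) throw std::runtime_error("short header");
    const unsigned char* p = reinterpret_cast<const unsigned char*>(buf);
    out.type = p[0];
    const std::size_t count = p[1];
    const std::size_t header_len = 2 + 4 * count;
    if (len < header_len) throw std::runtime_error("truncated token stack");
    out.act_stack.clear();
    for (std::size_t i = 0; i < count; ++i) {
        const unsigned char* q = p + 2 + 4 * i;
        out.act_stack.push_back((std::uint32_t(q[0]) << 24) |
                                (std::uint32_t(q[1]) << 16) |
                                (std::uint32_t(q[2]) << 8) |
                                std::uint32_t(q[3]));
    }
    BodyView body;
    body.data = buf + header_len;
    body.size = len - header_len;
    return body;
}

// Rebuild just the header into its own small buffer for forwarding.
inline std::string build_header(const Header& h) {
    assert(h.act_stack.size() <= 255);  // count must fit in one byte
    std::string out;
    out.push_back(static_cast<char>(h.type));
    out.push_back(static_cast<char>(h.act_stack.size()));
    for (std::uint32_t t : h.act_stack) {
        out.push_back(static_cast<char>((t >> 24) & 0xFF));
        out.push_back(static_cast<char>((t >> 16) & 0xFF));
        out.push_back(static_cast<char>((t >> 8) & 0xFF));
        out.push_back(static_cast<char>(t & 0xFF));
    }
    return out;
}
```

At a forwarding checkpoint, the rebuilt header and the untouched BodyView
can be handed to a gather write (for example POSIX writev) as two
separate pieces, so the body bytes are never copied in user space.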

-- 
Maxim Yegorushkin
