
From: Scott Woods (scottw_at_[hidden])
Date: 2005-06-13 23:00:26


----- Original Message -----
From: "Maxim Yegorushkin" <e-maxim_at_[hidden]>
To: <boost_at_[hidden]>
Sent: Monday, June 13, 2005 11:59 PM
Subject: Re: [boost] [Ann] socketstream library 0.7

[snip]
> >> ... And too slow. You have one data copy from kernel into the
> >> streambuf, and another one from the streambuf to the message object.
> >> The same is for output: message -> streambuf -> socket. This is
> >> unacceptable, at least for
[snip]
> >> 30% of user time was spent in guess where? In zeroing out memory in
> >> std::vector<char>::resize(). And you are talking about data copying
> >> here...
> >
> > Considering the protocol of your application has built in methods for
> > announcing the length of the payload, your requirement is met by the
> > streambuf::sgetn(char_type*, streamsize) method, for a properly
> > specialized implementation of the virtual protected xsgetn method.
[snip]
>
> > So you get operator semantics for free. :-)
> > And perhaps even a putback area, if there's one provided by this
> > particular streambuf implementation.
>
> Sounds interesting, but I don't see how this can work well with
> nonblocking sockets. You have to store how many bytes have already been
> read/sent somewhere.
[snip]
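
For concreteness, here is a minimal sketch of the kind of specialized
streambuf the quoted suggestion has in mind: the protected virtual
xsgetn() is overridden to read straight from the socket descriptor into
the caller's buffer, so sgetn() skips the copy through the get area. The
class name, the fd_ member and the use of ::recv() are illustrative
assumptions, not code from the proposed library, and the comments note
the extra bookkeeping a nonblocking variant would need.

#include <streambuf>
#include <sys/types.h>
#include <sys/socket.h>   // ::recv (POSIX)

class socketbuf : public std::streambuf {
public:
    explicit socketbuf(int fd) : fd_(fd) {}

protected:
    // Bulk read: pull bytes straight from the socket into the caller's
    // buffer, bypassing the streambuf's internal get area.
    virtual std::streamsize xsgetn(char* s, std::streamsize n) {
        std::streamsize total = 0;
        while (total < n) {
            ssize_t got = ::recv(fd_, s + total,
                                 static_cast<size_t>(n - total), 0);
            if (got <= 0)   // error, EOF, or EWOULDBLOCK on a nonblocking fd
                break;      // a nonblocking variant must remember 'total' and
            total += got;   // resume the partially read message later
        }
        return total;
    }

private:
    int fd_;
};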

There is really interesting material here. There is also other stuff that I
feel obliged to comment on :-)

1. Some of the contortions suggested for reading messages effectively off
an iostream socket are not specific to the fundamental network/socket
goals, i.e. they are just difficulties associated with iostream-ing.

2. Some of those same contortions are in response to the different
media (not sure if that's the best term) that the stream is running over,
i.e. a network transport. This is more ammunition for anyone trying
to shoot the sync-iostream-model-over-sockets down. Or at least to
suggest that the model is a constraint on those writing iostream-based
network apps.

3. Lastly, some of the observations (while excellent) seem a bit "macro"
when a wider view might lead to different thinking. What I am trying to
suggest here is that the time spent in vector<>::resize is truly surprising,
but it's also very low-level. Having been through 4 recent refactorings of a
network framework, I have been surprised at the gains made in other
areas by conceding, say, byte-level processing in one area.
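
As an aside on that resize() figure: std::vector<char>::resize(n)
value-initializes, i.e. zeroes, every new byte, and the read that follows
overwrites those bytes anyway. A minimal illustration of the two patterns
(fake_read is a purely hypothetical stand-in for recv()):

#include <cstddef>
#include <cstdio>
#include <vector>

// Stand-in for a socket read: fills 'len' bytes and returns the count.
static std::size_t fake_read(char* dst, std::size_t len) {
    for (std::size_t i = 0; i < len; ++i)
        dst[i] = 'x';
    return len;
}

int main() {
    // Pattern that pays for a memset on every message:
    for (int i = 0; i < 1000; ++i) {
        std::vector<char> buf;
        buf.resize(64 * 1024);            // zero-fills 64 KiB...
        fake_read(&buf[0], buf.size());   // ...then overwrites it
    }

    // One alternative: size the buffer once and reuse it, so the
    // zero-fill is paid a single time rather than per message.
    std::vector<char> reused(64 * 1024);
    for (int i = 0; i < 1000; ++i)
        fake_read(&reused[0], reused.size());

    std::printf("done\n");
    return 0;
}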

To make more of a case around the last point, consider the packetizing,
parsing and copying that has recently been discussed. This has been related
to the successful recognition of a message on the stream.

Is it acknowledged that a message is an arbitrarily complex data
object? By the time an application is making use of the "fields" within
a message, that's probably a reasonable assumption. So at some point
these fields must be "broken out" of the message. Or parsed. Or
marshalled. Or serialized. Is the low-level packet (with the length
header and body) being scanned again? What copying is being done?
This seems like multi-pass to me.
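
To make that shape concrete, here is a sketch against a made-up frame
format (4-byte length header, then a '\0'-terminated "from" field, then
the body). None of the names are an API from the proposed library; the
point is simply the holding buffer, the second scan and the field copies:

#include <algorithm>
#include <cstring>
#include <streambuf>
#include <string>
#include <vector>

struct message { std::string from; std::string body; };

// Pass 1 reads the frame into a holding buffer; pass 2 walks that
// buffer again to break the fields out (copying them into strings).
bool read_message(std::streambuf& sb, message& out) {
    char hdr[4];
    if (sb.sgetn(hdr, 4) != 4)
        return false;
    unsigned int len = 0;                 // assumes 32-bit unsigned int,
    std::memcpy(&len, hdr, 4);            // host byte order, for brevity
    if (len == 0)
        return false;

    std::vector<char> payload(len);       // the zero-fill cost lives here
    if (sb.sgetn(&payload[0], len) != static_cast<std::streamsize>(len))
        return false;

    const char* p   = &payload[0];
    const char* end = p + len;
    const char* nul = std::find(p, end, '\0');   // second pass over the bytes
    out.from.assign(p, nul);
    out.body.assign(nul == end ? end : nul + 1, end);
    return true;
}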

To get to the point: I am currently reading blocks off network connections
and presenting them to byte-by-byte lexer/parser routines. These form
the structured network messages directly, i.e. the fields are already
plucked out.
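
By contrast, here is a sketch of that byte-by-byte shape: each block is
fed, as it arrives, into a small state machine that builds the fields of
the message directly, so there is no holding buffer and no second scan.
The wire format (4-byte big-endian length, then a '\0'-terminated "from"
field, then the body) and every name here are made up for illustration;
a caller just loops, reading a block and calling consume() until it
returns true.

#include <cstddef>
#include <string>

struct message { std::string from; std::string body; };

class message_lexer {
public:
    message_lexer() : state_(HEADER), need_(4), length_(0) {}

    // Consume one block as it comes off the socket; returns true once a
    // complete message has been assembled in 'out'. (A real version
    // would also report how many bytes of the block it consumed.)
    bool consume(const char* data, std::size_t size, message& out) {
        for (std::size_t i = 0; i < size; ++i) {
            const unsigned char c = static_cast<unsigned char>(data[i]);
            switch (state_) {
            case HEADER:
                length_ = (length_ << 8) | c;
                if (--need_ == 0)
                    state_ = FROM;
                break;
            case FROM:
                --length_;
                if (c == '\0') {
                    if (length_ == 0) { reset(); return true; }  // empty body
                    state_ = BODY;
                } else {
                    out.from += static_cast<char>(c);
                }
                break;
            case BODY:
                out.body += static_cast<char>(c);
                if (--length_ == 0) { reset(); return true; }
                break;
            }
        }
        return false;   // need more bytes; progress is kept across calls
    }

private:
    void reset() { state_ = HEADER; need_ = 4; length_ = 0; }
    enum { HEADER, FROM, BODY } state_;
    unsigned need_;
    unsigned long length_;
};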

So which is better? Direct byte-by-byte conversion to structured network
messages, or multi-pass?

Cheers.

