From: Jonathan Turkanis (technews_at_[hidden])
Date: 2005-04-09 17:37:25


George M. Garner Jr. wrote:
> Johnathan,

Hi George,

>
> The Iostreams library provides for built in buffering of filters
> which in principle should permit efficient use of filters even with
> streams that do not support filtering. However, in practice it does
> not appear to live up to expectations. More specifically, the
> following line in indirect_streambuf.hpp can lead to extremely
> fragmented reads to lower filters or devices:
>
>
> std::streamsize indirect_streambuf<T, Tr, Alloc, Mode>::xsgetn
>     (char_type* s, std::streamsize n)
> {
>     [...]
>     // n - avail may equal 10 or less
>     streamsize amt = obj().read(s + avail, n - avail, next_);
>     [...]
> }
> This is true even if the filter buffer size is set to 1MiB or more.

This seems to be an optimization gone awry. The intended buffering policy,
reflected correctly (I hope) in underflow(), is to fill the input buffer as soon
as a read request is received, regardless of the size of the request, and to fill
read requests from the buffer until it is empty, at which point it will be
filled again.
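
A minimal sketch of that policy, written against the standard library only and
not taken from the Iostreams sources. The names CountingSource and BufferedInput
are hypothetical; the source simply records the size of each request it sees,
so the effect of the policy can be observed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <streambuf>
#include <string>
#include <utility>
#include <vector>

// Hypothetical source: serves bytes from a string and records the size
// of every read request it receives.
struct CountingSource {
    explicit CountingSource(std::string d) : data(std::move(d)) {}

    std::streamsize read(char* s, std::streamsize n) {
        requests.push_back(n);
        std::streamsize left = static_cast<std::streamsize>(data.size() - pos);
        std::streamsize amt = std::min(n, left);
        std::memcpy(s, data.data() + pos, static_cast<std::size_t>(amt));
        pos += static_cast<std::size_t>(amt);
        return amt;  // 0 signals end of data in this sketch
    }

    std::string data;
    std::size_t pos = 0;
    std::vector<std::streamsize> requests;
};

// Input-only stream buffer that follows the underflow policy: whenever
// the buffer is empty, ask the source for a full buffer, then serve all
// reads from the buffer until it is empty again.
class BufferedInput : public std::streambuf {
public:
    BufferedInput(CountingSource& src, std::size_t buffer_size)
        : src_(src), buf_(buffer_size) {
        assert(buffer_size > 0);
        setg(buf_.data(), buf_.data(), buf_.data());  // start with an empty get area
    }

protected:
    int_type underflow() override {
        if (gptr() < egptr())
            return traits_type::to_int_type(*gptr());
        // Always request the whole buffer, whatever the caller asked for.
        std::streamsize amt =
            src_.read(buf_.data(), static_cast<std::streamsize>(buf_.size()));
        if (amt <= 0)
            return traits_type::eof();
        setg(buf_.data(), buf_.data(), buf_.data() + amt);
        return traits_type::to_int_type(*gptr());
    }
    // xsgetn is deliberately not overridden: the default implementation
    // drains the get area and falls back on underflow().

private:
    CountingSource& src_;
    std::vector<char> buf_;
};
```

With a 16-byte buffer and a caller that reads 3 characters at a time through
sgetn, every request recorded by the source is for 16 bytes; the small user
reads never reach it directly.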

The implementation of xsgetn works well for large read requests, since a single
read is performed rather than a sequence of reads. You seem to be right, though,
that it is unsatisfactory for small reads. Whether this is bad for performance
depends on the size of n in typical filtering situations. I believe that n
should typically equal the buffer size for all filters and devices in a chain
other than the first, but that for the first filter or device, n will tend to
reflect the i/o operations performed by the end user, and so may turn out to be
small. I'd be interested to know if this is consistent with your experience.

One fix would be to have xsgetn fill read requests directly from the source only
if they are large, and otherwise use the underflow strategy. I'm inclined,
however, just to scrap xsgetn (and perhaps xsputn), and rely solely on underflow
(and overflow).
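
The first option could look roughly like the sketch below. It is an
illustration using only the standard library, not a patch against
indirect_streambuf.hpp; RecordingSource and HybridInput are hypothetical names,
and "large" is assumed here to mean "at least one full buffer".

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <streambuf>
#include <string>
#include <utility>
#include <vector>

// Hypothetical source that records the size of every read request.
struct RecordingSource {
    explicit RecordingSource(std::string d) : data(std::move(d)) {}

    std::streamsize read(char* s, std::streamsize n) {
        requests.push_back(n);
        std::streamsize left = static_cast<std::streamsize>(data.size() - pos);
        std::streamsize amt = std::min(n, left);
        std::memcpy(s, data.data() + pos, static_cast<std::size_t>(amt));
        pos += static_cast<std::size_t>(amt);
        return amt;  // 0 signals end of data in this sketch
    }

    std::string data;
    std::size_t pos = 0;
    std::vector<std::streamsize> requests;
};

class HybridInput : public std::streambuf {
public:
    HybridInput(RecordingSource& src, std::size_t buffer_size)
        : src_(src), buf_(buffer_size) {
        assert(buffer_size > 0);
        setg(buf_.data(), buf_.data(), buf_.data());  // empty get area
    }

protected:
    // Refill the whole buffer when it is empty.
    int_type underflow() override {
        if (gptr() < egptr())
            return traits_type::to_int_type(*gptr());
        std::streamsize amt =
            src_.read(buf_.data(), static_cast<std::streamsize>(buf_.size()));
        if (amt <= 0)
            return traits_type::eof();
        setg(buf_.data(), buf_.data(), buf_.data() + amt);
        return traits_type::to_int_type(*gptr());
    }

    std::streamsize xsgetn(char_type* s, std::streamsize n) override {
        const std::streamsize cap = static_cast<std::streamsize>(buf_.size());
        std::streamsize done = 0;
        while (done < n) {
            std::streamsize avail = egptr() - gptr();
            if (avail > 0) {
                // Serve from the buffer first.
                std::streamsize k = std::min(avail, n - done);
                std::memcpy(s + done, gptr(), static_cast<std::size_t>(k));
                setg(eback(), gptr() + k, egptr());
                done += k;
            } else if (n - done >= cap) {
                // Large remainder: read straight into the caller's memory.
                std::streamsize amt = src_.read(s + done, n - done);
                if (amt <= 0) break;
                done += amt;
            } else {
                // Small remainder: refill the buffer instead of issuing
                // a tiny read to the source.
                if (traits_type::eq_int_type(underflow(), traits_type::eof()))
                    break;
            }
        }
        return done;
    }

private:
    RecordingSource& src_;
    std::vector<char> buf_;
};
```

With a 16-byte buffer, a 3-byte sgetn triggers one 16-byte request; a following
40-byte sgetn takes the 13 buffered bytes and then issues a single direct
27-byte request, so the source never sees a request smaller than the buffer.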

I'd appreciate it if you would comment out the declaration and implementation of
xsgetn and see if you still experience problems.

> Large buffer sizes (> 2 MiB) greatly enhance read and write
> performance on many modern operating systems. This is particularly
> felt when reading and writing large files (> 4 GiB). Ideally, if you
> set a filter buffer policy (e.g. to 1 MiB) you would like all reads
> to lower filters and devices to request that value, except possibly
> for the last read.

Right.

> Fortunately, it is relatively trivial to disable Iostreams buffering
> altogether and write a buffering_shim_filter.

You can simply set the buffer size to zero when you add a filter to a chain.

> But then I do not
> understand what purpose Iostreams buffering serves. I Googled
> through the online documentation and I didn't find a detailed
> discussion of its objectives, though there were some comments that
> touched on the subject matter during the review process. Perhaps you
> can further elaborate this subject matter.

The purpose is to minimize the number of function calls (for filters and
devices) and to minimize the number of potentially expensive accesses to external
devices (mostly for devices).

Thanks for digging into the iostreams internals!

> Regards,
>
> George.

Jonathan

