From: Sebastian Redl (sebastian.redl_at_[hidden])
Date: 2007-09-27 19:42:03


Hi,

Some people might remember that I had plans for a new I/O system. Well,
today I finished two important milestones, so I thought I'd give a very
brief update as to what I was up to.

By now, I've designed and implemented:
1) Concepts for reading, writing, mark/reset and tell/seek.
2) Various forms of in-memory sources and sinks, fully tested.
3) Various forms of file sources and sinks, implemented for POSIX and
Win32 but as of yet untested.
4) A read-ahead buffer. (Reads a big block and serves further requests
out of that.)
5) A chaining system to conveniently build chains out of these components.
6) A type erasure source to hide type complexity.
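
To make item 4 concrete, here is a minimal sketch of a read-ahead buffer layered over a non-owning memory source. It is not the library's code: the names mem_source_sketch and readahead_sketch, the read() signature, and the octet typedef are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>

typedef unsigned char octet;  // assumption: an octet is an unsigned char

// Hypothetical non-owning memory source; not the library's real class.
class mem_source_sketch {
public:
    mem_source_sketch(const octet* data, std::size_t size)
        : cur_(data), end_(data + size) {}

    // Copies up to n octets into out and returns how many were copied.
    std::size_t read(octet* out, std::size_t n) {
        std::size_t count = std::min(n, static_cast<std::size_t>(end_ - cur_));
        std::memcpy(out, cur_, count);
        cur_ += count;
        return count;
    }

private:
    const octet* cur_;
    const octet* end_;
};

// Hypothetical read-ahead buffer: pulls up to N octets at a time from Source
// and serves later requests out of that block.
template <typename Source, std::size_t N>
class readahead_sketch {
public:
    explicit readahead_sketch(const Source& src) : src_(src), pos_(0), fill_(0) {}

    std::size_t read(octet* out, std::size_t n) {
        assert(out != 0 || n == 0);
        std::size_t done = 0;
        while (done < n) {
            if (pos_ == fill_) {            // buffer exhausted: refill it
                fill_ = src_.read(buf_, N);
                pos_ = 0;
                if (fill_ == 0) break;      // underlying source has no more data
            }
            std::size_t count = std::min(n - done, fill_ - pos_);
            std::memcpy(out + done, buf_ + pos_, count);
            pos_ += count;
            done += count;
        }
        return done;
    }

private:
    Source src_;
    octet buf_[N];
    std::size_t pos_;
    std::size_t fill_;
};
```

A readahead_sketch<mem_source_sketch, 8> touches the underlying source at most once per 8 octets served, which is the whole point for slow devices such as files.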

Right now, it is possible to construct a buffered in-memory source
(yeah, like that's useful) with this code:

source<octet> src( read_mem_unowned(sample_data, sample_size) | buffer<8>() );

This constructs a source that reads octets from the memory area at
sample_data (which has a size of sample_size units), buffered by an
8-byte read-ahead buffer. The resulting source uses type erasure to get
the short type. Overhead is exactly one virtual call for every
operation. If that's too much for you, you can specify the full type of
the source:

fixed_readahead_buffer_filter<non_owned_mem_source<octet>, 8>

If you do that, GCC inlines the entire read operation under -O3 - it all
boils down to a few memmoves. (It's a few because buffering in-memory
operations only adds overhead without benefit. But whatever.)
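
The "one virtual call per operation" cost of the short type can be illustrated with a generic type-erasure sketch. This is only an assumption about the general technique, not the library's actual source<octet>; the names erased_source_sketch and array_source_sketch are made up for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>

typedef unsigned char octet;  // assumption: an octet is an unsigned char

// Hypothetical concrete source reading from a range of octets it does not own.
struct array_source_sketch {
    const octet* cur;
    const octet* end;

    std::size_t read(octet* out, std::size_t n) {
        std::size_t count = std::min(n, static_cast<std::size_t>(end - cur));
        std::memcpy(out, cur, count);
        cur += count;
        return count;
    }
};

// Hypothetical type-erased source: hides the concrete source type behind an
// abstract interface, so each read() costs exactly one virtual call.
class erased_source_sketch {
public:
    template <typename Source>
    explicit erased_source_sketch(Source src) : impl_(new model<Source>(src)) {}

    std::size_t read(octet* out, std::size_t n) {
        assert(impl_.get() != 0);
        return impl_->read(out, n);         // the single virtual call
    }

private:
    struct iface {
        virtual ~iface() {}
        virtual std::size_t read(octet* out, std::size_t n) = 0;
    };

    template <typename Source>
    struct model : iface {
        explicit model(Source s) : src(s) {}
        // Forwards to the concrete type; this call can be inlined.
        std::size_t read(octet* out, std::size_t n) { return src.read(out, n); }
        Source src;
    };

    std::unique_ptr<iface> impl_;
};
```

Using the concrete type directly skips the virtual dispatch entirely, which is what gives the compiler the chance to inline the whole read path.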

Conversely, if you wanted to write data, you could do this (not yet
implemented):

sink<octet> snk( write_mem() | buffer<8>() );

The chaining system is powerful enough to allow the chain stubs to
select the appropriate type. The full type of the generated sink is:

fixed_write_collect_buffer_filter<mem_sink<octet>, 8>

The problem here is getting the data out. I've thought of a system for
doing that in such a case, but that's for a separate post. For now, you
could just construct the sink separately and then chain it in:

mem_sink<octet> data_sink;
sink<octet> snk( chain(data_sink) | buffer<8>() );

Now snk holds a reference to data_sink. (So data_sink must stay valid.)
And you can get at the sink's underlying vector by calling data_sink.data().

If you don't like the operators, you can use chain() instead:

chain(write_mem()).chain(buffer<8>())
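
How chain stubs might select the concrete type, and how a chained-in sink is held by reference, can be sketched as follows. Everything here is hypothetical (chain_sketch, buffer_stub, write_buffer_sketch, mem_sink_sketch); it only shows one plausible way an operator| over stubs could work, not the library's actual chaining system.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef unsigned char octet;  // assumption: an octet is an unsigned char

// Hypothetical memory sink that appends everything written to a vector.
class mem_sink_sketch {
public:
    void write(const octet* p, std::size_t n) { data_.insert(data_.end(), p, p + n); }
    const std::vector<octet>& data() const { return data_; }
private:
    std::vector<octet> data_;
};

// Hypothetical write buffer: collects N octets before forwarding them.
// It holds a reference, so the underlying sink must outlive it.
template <typename Sink, std::size_t N>
class write_buffer_sketch {
public:
    explicit write_buffer_sketch(Sink& next) : next_(next), fill_(0) {}

    void write(const octet* p, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i) {
            buf_[fill_++] = p[i];
            if (fill_ == N) flush();
        }
    }

    void flush() {
        assert(fill_ <= N);
        next_.write(buf_, fill_);
        fill_ = 0;
    }

private:
    Sink& next_;
    octet buf_[N];
    std::size_t fill_;
};

// Stub meaning "a reference to an existing sink".
template <typename Sink>
struct ref_stub { Sink& sink; };

template <typename Sink>
ref_stub<Sink> chain_sketch(Sink& s) { ref_stub<Sink> r = { s }; return r; }

// Stub meaning "a buffer of N octets"; it does not know the sink type yet.
template <std::size_t N>
struct buffer_stub {};

// operator| combines the two stubs and picks the concrete filter type.
template <typename Sink, std::size_t N>
write_buffer_sketch<Sink, N> operator|(ref_stub<Sink> r, buffer_stub<N>) {
    return write_buffer_sketch<Sink, N>(r.sink);
}
```

With these definitions, chain_sketch(data_sink) | buffer_stub<4>() yields a write_buffer_sketch<mem_sink_sketch, 4> that refers to data_sink, so the written data can be inspected through data_sink.data() after a flush.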

Under C++0x, you can use auto and decltype to get the exact type
with less typing, with something like this:

auto sink_spec(write_mem() | buffer<8>());
decltype(sink_spec)::stream_type snk(sink_spec);

Also not implemented, but planned, are text streams. Reading an
ISO-8859-1 text file in the finished library would look like this:

source< text<utf_8> > src( // A source of UTF-8 text.
    read_file(filename) | buffer<1024>() | text_decode<utf_8>("ISO-8859-1")
  );
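
For reference, the byte-level work such a decoder has to do for ISO-8859-1 input is small, because every ISO-8859-1 byte is the Unicode code point of the same value. The function below is a standalone sketch of that mapping only; its name is made up, and it says nothing about how the planned text_decode component would be implemented.

```cpp
#include <cassert>
#include <string>

// Converts ISO-8859-1 bytes to UTF-8. Bytes below 0x80 are copied unchanged;
// bytes 0x80..0xFF become the two-byte sequence 110000xx 10xxxxxx.
std::string latin1_to_utf8_sketch(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (std::string::size_type i = 0; i < in.size(); ++i) {
        unsigned char c = static_cast<unsigned char>(in[i]);
        if (c < 0x80) {
            out.push_back(static_cast<char>(c));
        } else {
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));    // top 2 bits
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));  // low 6 bits
        }
    }
    assert(out.size() >= in.size());  // UTF-8 output never shrinks the input
    return out;
}
```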

Things I have to do now:
1) Implement type erasures for sinks and bidi-devices.
2) Make type erasures gracefully handle missing functionality in the
underlying devices. That is, if a device can't handle seeking, it
shouldn't have to implement no-op or throwing stubs for the seek
methods. The erasure should do that.
3) Implement lots more components.
4) Test the file devices.
5) Make writing components easier. Right now, using components is easy,
but writing them is a bit of a mess due to the chain system.
6) Get started on the text parts.
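
For item 2, one possible approach is to let the erasure detect at compile time whether the device has a seek method and supply the throwing stub itself. The sketch below assumes C++0x features (decltype, std::declval) and uses made-up names throughout (has_seek, erased_base_sketch, erased_device, and the two device types); it is an illustration of the idea, not the planned implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <type_traits>
#include <utility>

// Detects whether T has a member callable as seek(std::size_t).
template <typename T>
struct has_seek {
    template <typename U>
    static auto test(int)
        -> decltype(std::declval<U&>().seek(std::size_t(0)), std::true_type());
    template <typename U>
    static std::false_type test(...);

    static const bool value = decltype(test<T>(0))::value;
};

// Hypothetical erased interface exposing seek for every device.
class erased_base_sketch {
public:
    virtual ~erased_base_sketch() {}
    virtual void seek(std::size_t pos) = 0;
};

// The erasure forwards seek when the device has it and throws otherwise,
// so devices never have to write the stub themselves.
template <typename Device>
class erased_device : public erased_base_sketch {
public:
    explicit erased_device(Device& d) : dev_(d) {}

    void seek(std::size_t pos) {
        do_seek(pos, std::integral_constant<bool, has_seek<Device>::value>());
    }

private:
    // Only instantiated for devices that really have seek().
    void do_seek(std::size_t pos, std::true_type) { dev_.seek(pos); }
    void do_seek(std::size_t, std::false_type) {
        throw std::logic_error("device does not support seeking");
    }

    Device& dev_;
};

// Two hypothetical devices: one seekable, one without any seek member.
struct seekable_device_sketch {
    std::size_t size;
    std::size_t pos;
    void seek(std::size_t p) { assert(p <= size); pos = p; }
};

struct stream_device_sketch {};
```

Wrapping a stream_device_sketch compiles even though it has no seek member; calling seek on the erased wrapper throws std::logic_error instead.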

I hope I'll be back soon with more to tell.

Sebastian Redl

