From: Phil Endecott (spam_from_boost_dev_at_[hidden])
Date: 2007-06-20 19:04:54


Sebastian Redl wrote:
> A few weeks ago, a discussion that followed the demonstration of the
> binary_iostream library made me think about the standard C++ I/O and
> what I would expect from an I/O model.
>
> Now I have a preliminary design document ready and would like to have
> some feedback from the Boost community on it.

Hi Sebastian,

This is an interesting document and you have obviously put a lot of
work into it. My few thoughts follow. I can't claim to have great
insight into this problem, but there have been more than a couple of
times when the limitations of what is currently available have struck me.

** Formatting of user-defined types often broken in practice.

The ability to write overloaded functions to format user-defined types
for text I/O is attractive in theory, but in practice it always lets me
down somewhere. My main complaint is that neither of these works:

typedef std::set<thing> things_t;
operator<<(things_t things) { .... } // doesn't work because things_t is a typedef

uint8_t i;
cout << i; // doesn't work because uint8_t is actually a char

When I do have a class, I often find that there is more than one way in
which I'd like to format it, but there is only one operator<< to
overload. And often I want to put the result of the formatting into a
string, not a stream.

So for all of these reasons I have more explicit to_str() functions in
my code than operator<<s.
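
To make the to_str() approach concrete, here is a minimal sketch. The function names are illustrative only and do not come from any library, and std::set<int> stands in for the set of things above. The uint8_t overload widens the value before formatting, which avoids the character-output surprise.

```cpp
#include <cstdint>
#include <set>
#include <sstream>
#include <string>

// Hypothetical helper names; nothing here comes from an existing library.

// Format a uint8_t as a number. Streaming it directly would print a
// character, because uint8_t is normally a typedef for unsigned char.
std::string to_str(std::uint8_t v) {
    std::ostringstream os;
    os << static_cast<unsigned>(v);   // widen so it prints as an integer
    return os.str();
}

// Format a set of ints as "{1, 2, 3}". A second function with a different
// name could offer another layout for the same type.
std::string to_str(const std::set<int>& s) {
    std::ostringstream os;
    os << '{';
    bool first = true;
    for (int x : s) {
        if (!first) os << ", ";
        os << x;
        first = false;
    }
    os << '}';
    return os.str();
}
```

Because these are ordinary named functions, the result is a string that can be passed anywhere, and nothing depends on how the argument type was declared.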

** lexical_cast<> uses streams, should be reversed.

Currently we implement formatters that output to streams. We implement
lexical_cast using stringstreams. Surely it would be preferable to
implement formatters as specialisations of lexical_cast to a string (or
character sequence / output iterator / whatever) and to implement
formatted output to streams on top of that. I suppose you could argue
that the stream model is better for very large amounts of output since
you don't accumulate it all in a temporary string, but I've never
encountered a case where that would matter.

** Formatting state has the wrong scope

Spot the mistake here:

cout << "address of buffer = 0x" << hex << p;

Yes, I forgot to << dec afterwards, so in some totally different part
of the program when I write

cout << "DEBUG: x=" << x;

and it prints '10', I think "10? should be 16!" and spend ages debugging.

But reverting to dec might not be the right thing to do depending on
what the caller was in the middle of doing, so I really want to
save/restore the formatting state. And if I throw or do a premature
return I still want the formatting state to be reverted:

void f() {
   scoped_fmt_state guard(cout,hex); // named, so it lives until the end of f()
   cout << ....;
   if (...) throw;
   cout << .....;
}

Hmm, I think that's too much work. I'd be happy with NO formatting
state in the stream, and to use explicit formatting when I want it:

    cout << hex(x);
OR cout << format("%08x",x);
OR fprintf(stdout,"%08x",x);

(No, I don't really use printf() in C++ code. But it does have its
strengths; it's by far the best way to output a uint8_t. And it _is_
type safe if you are using a compiler that treats it as special.)

** Too much disconnect between POSIX file descriptors and std::streams

I have quite a lot of code that uses sockets and serial ports, does
ioctls on file descriptors, and things like that. So I have a
FileDescriptor class that wraps a file descriptor with methods that
implement simple error-trapping wrappers around the POSIX function calls.

Currently, there's a strong separation between what I can do to a
FileDescriptor (i.e. reads and writes) and what I can do to a stream.
There is no reason why this has to be the case. It should be possible
to add buffering to a FileDescriptor *and only add buffering*, and it
should be possible to do formatted I/O on a non-buffered FileDescriptor.

In other words:

class ReadWriteThing { .... };
class FileDescriptor: public ReadWriteThing { .... };
class Stream: public ReadWriteThing { .... };

FileDescriptor fd("192.168.1.1:80"); // a socket
int i=1234;
fd << "GET " << i << "\r\n"; // Unbuffered write, text formatting.

Stream s("foo.bin"); // a file, with a buffering layer
for (int i=0; i<1000; ++i) {
   short r = f();
   s.write(r); // Buffered, non-formatted binary write.
}
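
One way this layering might look is sketched below. All names are hypothetical, and an in-memory MemorySink stands in for a real file descriptor so that the example has no platform dependencies. Buffering is a separate wrapper, and text formatting works on any ReadWriteThing.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Minimal byte-sink interface: the only operation is an unformatted write.
struct ReadWriteThing {
    virtual ~ReadWriteThing() {}
    virtual void write(const char* data, std::size_t n) = 0;
};

// Unbuffered sink: every write goes straight to the "device".
struct MemorySink : ReadWriteThing {
    std::string device;
    int calls = 0;                        // number of underlying writes
    void write(const char* data, std::size_t n) override {
        device.append(data, n);
        ++calls;
    }
};

// Buffering layer: adds buffering and nothing else.
class Buffered : public ReadWriteThing {
public:
    Buffered(ReadWriteThing& next, std::size_t capacity)
        : next_(next), capacity_(capacity) {}
    ~Buffered() { flush(); }
    void write(const char* data, std::size_t n) override {
        buf_.append(data, n);
        if (buf_.size() >= capacity_) flush();
    }
    void flush() {
        if (!buf_.empty()) {
            next_.write(buf_.data(), buf_.size());
            buf_.clear();
        }
    }
private:
    ReadWriteThing& next_;
    std::size_t capacity_;
    std::string buf_;
};

// Text formatting works on any ReadWriteThing, buffered or not.
template <typename T>
ReadWriteThing& operator<<(ReadWriteThing& out, const T& value) {
    std::ostringstream tmp;
    tmp << value;
    const std::string s = tmp.str();
    out.write(s.data(), s.size());
    return out;
}
```

Formatted output to a bare MemorySink issues one underlying write per item, while the same calls through Buffered are collected and passed on in larger pieces.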

** Character sets need support

This is a hugely complex area which native English speakers are
uniquely unqualified to talk about.

I think that a starting point would be for someone to write a Boost
interface to iconv (I have an example that makes functors for iconv
conversions), and to write a tagged-string class that knows its
encoding (either a compile-time type tag or a run-time enumeration tag
or both). Ideally we'd spend a couple of years getting used to using
that, and then consider how it can best integrate with IO.
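
A very small sketch of the compile-time-tagged variant follows. The tag types and class are hypothetical, and a hand-written ISO-8859-1 to UTF-8 conversion stands in for a call to iconv so that the example stays self-contained.

```cpp
#include <string>
#include <utility>

// Hypothetical encoding tags and tagged-string class; not an existing API.
struct latin1 {};
struct utf8 {};

// A byte string whose encoding is part of its type, so strings in
// different encodings cannot be mixed up by accident.
template <typename Encoding>
class tagged_string {
public:
    explicit tagged_string(std::string bytes) : bytes_(std::move(bytes)) {}
    const std::string& bytes() const { return bytes_; }
private:
    std::string bytes_;
};

// One hand-written conversion, standing in for a call to iconv.
// Each ISO-8859-1 byte maps to the Unicode code point of the same value.
tagged_string<utf8> to_utf8(const tagged_string<latin1>& in) {
    std::string out;
    for (unsigned char c : in.bytes()) {
        if (c < 0x80) {
            out.push_back(static_cast<char>(c));                 // ASCII
        } else {
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));   // lead byte
            out.push_back(static_cast<char>(0x80 | (c & 0x3F))); // continuation
        }
    }
    return tagged_string<utf8>(out);
}
```

Passing a tagged_string<utf8> where a tagged_string<latin1> is expected is then a compile error instead of silently corrupted text.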

Regards,

Phil.

