From: Jeremy Maitin-Shepard (jbms_at_[hidden])
Date: 2007-06-17 23:13:09
Sebastian Redl <sebastian.redl_at_[hidden]> writes:
> A few weeks ago, a discussion that followed the demonstration of the
> binary_iostream library made me think about the standard C++ I/O and
> what I would expect from an I/O model.
> Now I have a preliminary design document ready and would like to have
> some feedback from the Boost community on it.
> The document can be found here:
> http://windmuehlgasse.getdesigned.at/newio/
> I'd especially like input on the unresolved issues, but all comments are
> welcome, even if you tell me that what I'm doing is completely pointless
> and misguided. (At least I'd know not to waste my time with refining and
> implementing the design. :-))
I am pleased to see you taking an interest in a new I/O library for C++.
The existing C++ I/O facilities have always bothered me, but I've never
gotten around to trying to write something better. I have a number of
comments. They aren't particularly well structured, because I didn't
bother to try to reorganize them after initially just writing down
thoughts as they occurred to me.
- I think it is important to look at the boost iostreams architecture,
and make sure to include or reuse any of the ideas or even actual code
if possible. One idea from that library to consider is the
direct/indirect device distinction.
- Binary transport layer issue:
Make the "binary transport layer" the "byte transport layer" to make
it clear that it is for bytes.
Platforms with unusual features, like 9-bit bytes or an inability to
handle types smaller than 32 bits, can possibly still implement
the interface for a text/character transport layer, possibly on top of
some other lower-level transport that need not be part of the boost
library. Clearly, the text encoding and decoding would have to be
done differently anyway.
- Asynchronous issue:
Asynchronous I/O is extremely useful, but it also requires a very
different architecture: something like asio's io_service is needed to
manage requests, and a function to call on completion or error must be
provided.
One issue is that there are very large differences between platforms
(Windows and Linux). On Linux, asynchronous I/O via efficient polling
for readiness is possible for sockets and pipes using epoll (and
somewhat less efficiently using select and poll), but these mechanisms
cannot be used for regular files. I think there may be other
asynchronous I/O mechanisms on Linux that do support regular files, at
least on some filesystems, but which are not very easily compatible
with epoll and other methods suitable for sockets. Furthermore, even
if read and write are asynchronous, open will always be synchronous on
Linux. It may not be feasible, therefore, to implement a proper
asynchronous I/O interface on Linux. Even on Windows, I believe it may
not be possible to get asynchronous open.
Thus, I think I agree that it would be better to avoid including an
asynchronous I/O interface in this library, although probably a bit
more thought should go into the decision before it is made.
- Seeking:
Maybe make multiple mark/reset use the same interface as seeking, for
simplicity. Just define that a seeking device has the additional
restriction that the mark type is an offset, and the argument to seek
need not be the result of a call to tell.
Another issue is whether to standardize the return type from tell,
like std::streampos in the C++ iostreams library.
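As a rough sketch of the unified mark/seek idea, assuming a hypothetical in-memory device (memory_source, mark_type, and all member names below are invented for illustration and are not taken from the design document):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical in-memory source. Because it is seekable, its mark type is
// simply an offset, so mark()/reset() and tell()/seek() share one mechanism.
class memory_source {
public:
    typedef std::size_t mark_type;

    explicit memory_source(std::vector<unsigned char> data)
        : data_(std::move(data)), pos_(0) {}

    // Read up to n bytes into out; returns the number of bytes actually read.
    std::size_t read(unsigned char* out, std::size_t n) {
        const std::size_t available = data_.size() - pos_;
        const std::size_t count = std::min(n, available);
        std::copy(data_.begin() + pos_, data_.begin() + pos_ + count, out);
        pos_ += count;
        return count;
    }

    // Mark/reset interface: remember a position and return to it later.
    mark_type mark() const { return pos_; }
    void reset(mark_type m) {
        assert(m <= data_.size());
        pos_ = m;
    }

    // Seeking interface: the same operations under other names. The only
    // extra guarantee is that seek() accepts any valid offset, not just a
    // value previously returned by tell().
    mark_type tell() const { return mark(); }
    void seek(mark_type offset) { reset(offset); }

private:
    std::vector<unsigned char> data_;
    std::size_t pos_;
};
```

A non-seekable device would keep mark() and reset() but use an opaque mark_type, and would simply not offer seek() with arbitrary offsets.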
- Binary formatting (perhaps the name "data format" would be better?):
I think it is important to provide a way to format
{uint,int}{8,16,32,64}_t as either little or big endian two's
complement (and possibly also one's complement). It might be useful
to look at the not-yet-official boost endian library in the vault.
A similar variety of output formats for floating point types should
also be supported.
It is also important to provide the most efficient output format as
an option as well (i.e. writing the in-memory representation of the
type directly, via e.g. reinterpret_cast). It should probably also
be possible to determine at compile time, using the library, what the
native format is. It is not clear what to do about the issue of some
platforms not using any standard format as their native format.
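To make the endianness point concrete, here is a minimal sketch of explicit-byte-order integer formatting next to the raw in-memory format. All names (byte_order, encode_uint, encode_int, decode_uint, encode_native, native_order) are hypothetical and do not come from the design document or from the endian library in the vault. The native byte order is probed at run time here only to keep the sketch short; a compile-time query would need compiler-specific or newer-standard support.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical tag for the byte order of the external format.
enum class byte_order { little, big };

// Encode an unsigned integer into sizeof(T) bytes in the requested order.
// Only shifts are used, so the result does not depend on the host's order.
template <typename T>
std::array<std::uint8_t, sizeof(T)> encode_uint(T value, byte_order order) {
    static_assert(std::is_unsigned<T>::value, "unsigned integer types only");
    std::array<std::uint8_t, sizeof(T)> out{};
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        const std::size_t shift =
            (order == byte_order::little) ? i : sizeof(T) - 1 - i;
        out[i] = static_cast<std::uint8_t>((value >> (8 * shift)) & 0xFF);
    }
    return out;
}

// Signed values: converting to the unsigned type of the same width yields
// the two's complement bit pattern, whatever the host representation is.
template <typename T>
std::array<std::uint8_t, sizeof(T)> encode_int(T value, byte_order order) {
    typedef typename std::make_unsigned<T>::type U;
    return encode_uint<U>(static_cast<U>(value), order);
}

// Inverse of encode_uint.
template <typename T>
T decode_uint(const std::array<std::uint8_t, sizeof(T)>& in, byte_order order) {
    static_assert(std::is_unsigned<T>::value, "unsigned integer types only");
    T value = 0;
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        const std::size_t shift =
            (order == byte_order::little) ? i : sizeof(T) - 1 - i;
        value = static_cast<T>(value | (static_cast<T>(in[i]) << (8 * shift)));
    }
    return value;
}

// The "most efficient" option: copy the in-memory representation as is.
template <typename T>
std::array<std::uint8_t, sizeof(T)> encode_native(T value) {
    std::array<std::uint8_t, sizeof(T)> out;
    std::memcpy(out.data(), &value, sizeof(T));
    return out;
}

// Run-time probe of the host byte order. Assumes the host is either
// little or big endian; mixed-endian hosts are not handled.
inline byte_order native_order() {
    const std::uint16_t probe = 1;
    unsigned char first = 0;
    std::memcpy(&first, &probe, 1);
    return first == 1 ? byte_order::little : byte_order::big;
}
```

On a host whose native format is one of the two standard orders, encode_native(x) and encode_uint(x, native_order()) produce the same bytes, so a library could select the memcpy path whenever the requested format matches the native one.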
- Header vs Precompiled:
I think as much should be separately compiled as possible, but I also
think that type erasure should not be used in any case where it will
significantly compromise performance.
- The "byte" stream and the character stream, while conceptually
different, should probably both be considered just "streams" of
particular POD types. The interfaces will in general be exactly the
same as far as reading, writing, seeking, filtering.
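A small sketch of what "just streams of particular types" could look like, assuming a hypothetical vector-backed sink (vector_sink, byte_sink, utf32_sink, and write_twice are invented names; std::is_trivially_copyable stands in for the POD requirement):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical sink that stores elements of type T in a vector.
template <typename T>
class vector_sink {
    static_assert(std::is_trivially_copyable<T>::value,
                  "stream element types must be trivially copyable");
public:
    typedef T value_type;

    // Append n elements starting at data.
    void write(const T* data, std::size_t n) {
        assert(data != nullptr || n == 0);
        buffer_.insert(buffer_.end(), data, data + n);
    }

    const std::vector<T>& contents() const { return buffer_; }

private:
    std::vector<T> buffer_;
};

// The byte stream and a character stream differ only in element type.
typedef vector_sink<std::uint8_t> byte_sink;
typedef vector_sink<char32_t> utf32_sink;

// Generic code written once against the shared interface.
template <typename Sink>
void write_twice(Sink& sink, const typename Sink::value_type* data,
                 std::size_t n) {
    sink.write(data, n);
    sink.write(data, n);
}
```

Filters and seeking could be layered on in the same way, parameterized on the element type rather than duplicated for bytes and characters.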
- Text transport:
I don't think this layer should be restricted to Unicode encodings.
Rather, a text transport should just be a "stream" of type T, where T
might be uint8_t, uint16_t, uint32_t depending on the character
encoding. For full generality, the library should provide facilities
for converting between any two of a large list of encodings. (For
simplicity, some of these conversions might internally be implemented
by converting first to one encoding, like UTF-16, and then converting
to the other encoding, if a direct conversion is not coded specially.)
I think it is important to require that all of a minimal set of
encodings are supported, where this minimal set should include at
least all of the common Unicode encodings, and perhaps all of the
ISO-8859-* encodings as well, in addition to ASCII.
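The pivot-encoding idea can be sketched as follows. This sketch pivots through a vector of Unicode code points (effectively UTF-32) rather than UTF-16, purely because it keeps the code shorter; the function and type names are hypothetical. With one decoder per source encoding and one encoder per target encoding, any pair can be converted without writing a dedicated routine for each combination.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Pivot representation: a sequence of Unicode code points.
typedef std::vector<char32_t> code_points;

// Decoder: ISO-8859-1 maps each byte to the code point of the same value.
code_points decode_latin1(const std::string& in) {
    code_points out;
    for (unsigned char c : in) out.push_back(c);
    return out;
}

// Encoder: code points to UTF-8.
std::string encode_utf8(const code_points& in) {
    std::string out;
    for (char32_t cp : in) {
        // Surrogates and values above U+10FFFF are not valid code points.
        assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}

// Encoder: code points to UTF-16, using surrogate pairs above U+FFFF.
std::u16string encode_utf16(const code_points& in) {
    std::u16string out;
    for (char32_t cp : in) {
        if (cp < 0x10000) {
            out += static_cast<char16_t>(cp);
        } else {
            cp -= 0x10000;
            out += static_cast<char16_t>(0xD800 | (cp >> 10));
            out += static_cast<char16_t>(0xDC00 | (cp & 0x3FF));
        }
    }
    return out;
}

// A conversion with no dedicated implementation: compose via the pivot.
std::string latin1_to_utf8(const std::string& in) {
    return encode_utf8(decode_latin1(in));
}
```

A frequently used pair could still get a hand-written direct conversion where the extra pass through the pivot is too costly.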
- Text formatting:
For text formatting, I think it would be very useful to look at the
IBM ICU library. It may in fact make sense to leave text formatting
as a separate library (for example, as a Unicode library), since it is
somewhat encoding specific, and a huge task by itself and not very
related to this I/O library. As long as the I/O library provides a
suitable character stream interface, an arbitrary formatting facility
can be used on top of it.
-- Jeremy Maitin-Shepard