Boost :

Date view	Thread view	Subject view	Author view

From: Jonathan Turkanis (technews_at_[hidden])
Date: 2004-08-30 11:01:43

Next message: Richard Hadsell: "[boost] Building Boost with PathScale compiler"
Previous message: Alex Chovanec: "[boost] Re: Review schedule"
In reply to: Daryle Walker: "Re: [boost] IOStreams formal review start"
Next in thread: Jonathan Turkanis: "[boost] Re: IOStreams formal review start"
Reply: Jonathan Turkanis: "[boost] Re: IOStreams formal review start"
Reply: Daryle Walker: "Re: [boost] IOStreams formal review start"

"Daryle Walker" <darylew_at_[hidden]> wrote:

> On 8/28/04 8:09 PM, "Jeff Garland" <jeff_at_[hidden]> wrote:
>
> > Today (August 28th, 2004) is the start of the formal review of the Iostreams
> > library by Jonathan Turkanis. I will be serving as review manager. Note
that
> > this is a somewhat unusual situation in that we have several libraries that
> > overlap in the same area, so comments related to the MoreIo overlap are
> > needed.
> >
> >
> > http://home.comcast.net/~jturkanis/iostreams/

Thanks for the review!

> Some concerns over what qualifies:
>
> 1. Aren't memory-mapped files and file descriptors highly platform
> specific?

Yes, just like threads, sockets and directory iteration.

> Code that works with them would have to be non-portable, so I
> don't think they're appropriate for Boost.

It achieves portability the same way boost.thread and boost.filesystem do: by
having separate implementations for different systems. See
http://www.boost.org/more/imp_vars.htm ("Implementation variations").

> 2. This library does what a lot of other text-I/O libraries do, try to fit
> in "kewl" compression schemes. The problem is that the types of compression
> here are binary oriented; they convert between sets of byte streams.
> However, characters are not bytes (although characters, like other types,
> are stored as bytes).

Are you saying there are problems with the implementation of the compression
filters, e.g., that they make unwarranted assumptions about 'char'? If so,
please let me know. I'm sure it can be fixed.

> There are issues of character-to-byte-sequence
> conversion. These issues, and binary I/O itself, should _not_ be snuck in
> through a text-I/O component. Worse, we have an upcoming library
> (Serialization) that also has to deal with binary I/O stuff, so we should
> have a binary I/O strategy. That means that the compression stuff should be
> skipped for now.

> (If someone says that this library is supposed to be text and/or binary
> I/O, that's even worse! The two types should be distinct and not just
> glommed together.)

I don't see the iostream framework as relating to text streams only: streams
can handle text and binary. In some cases, you want text and binary to work
together. E.g., suppose you have a compressed text file ("essay.z") and you want
to read a 'family-friendly' version of it. You can do so as follows:

    filtering_istream in;
    in.push(regex_filter(regex("damn"), "darn"));
    in.push(zlib_decompressor());
    in.push(file_source("essay.z"));
    // read from in.

Isn't this perfectly natural and convenient? What's wrong with using the
decompressor and the regex filter in the same chain?

> Now to the review itself:

I thought it had already started ;-)

> 1. Actual code using this library is very slick and easy to set up. This
> ease of use/set-up also applies to the plug-in filters and/or resources.

Thanks.

> The end-user experience is so awesome that we could just say "ship it"
> and go home. However, the experience is the only good point. That ease
> comes with a nasty little price tag, or more accurately, a nasty BIG price
> tag.
>
> The core question on keeping this library is:
>
> Besides filtering/chaining, is there any part of the interface that
> isn't already covered by the standard stream (buffer) system?

Can I rephrase this as follows: InputFilters and OutputFilters are a useful
addition to the standard library, but Sources and Sinks just duplicate
functionality alread present? If this is not your point please correct me.

There are two main resons to write Sources and Sinks instead of stream buffers:

1. Sources and Sinks and sinks express just the core functionality of a
component. Usually you have to implement just one or two functions with very
natural interfaces. You don't have to worry about buffering or about putting
back characters. I would have thought it would be obvious that it's easier to
write:

        template<typename Ch>
        struct null_buf {
            typedef Ch char_type;
            typedef sink_tag category;
            void write(const Ch*, std::streamsize) { }
        };

than to write your null_buf, which is 79 lines long.

2. Sources and sinks can be reused in cases where standard streams and stream
buffers are either unnecessary or are not the appropriate abstraction. For
example, suppose you want to write the concatenation of three files to a string.
You can do so like this:

    string s;
    boost::io::copy(
        concatenate(
            file_source("file1"),
            file_source("file2"),
            file_source("file3")
        ),
        back_insert_resource(s)
    );

This is IMO clearer and possibly much more efficient than:

    string s;
    ofstream file1("file1");
    ofstream file2("file2");
    ofstream file3("file3");
    ostringstream out;
    out << file1.rdbuf();
    out << file2.rdbuf();
    out << file3.rdbuf();
    s = out.str();

(Note: concatenate is unimplemented, pending resolution of some open issues such
as the return type of 'write').

Another example, for the future, might be resources for asynchronous and
multiplexed i/o. (See 'Future Directions', http://tinyurl.com/6r8p2.)

> The whole framework seems like "I/O done 'right'", a "better" implementation
> of the ideas/concepts shown in the standard I/O framework.

I'd say thanks here if 'right' and 'better' weren't in quotes ;-)

> The price is a
> code size many times larger than the conventional system,

Are you talking about the size of the libray or the size of the generated code?
Most of the filter and resource infrastructure, as found in headers such as
<boost/io/io_traits.hpp>, <boost/io/operations.hpp> and
<boost/io/detail/resource_adapter.hpp>, should compile down to nothing with
optimizations enabled. A typical instantation of streambuf_facade should be only
slightly larger, if at all, than a typical hand-written stream buffer. The main
difference is that more protected virtual function may be implemented than are
actually needed. Even this can be fixed, if necessary; at this point it seems
like a premature optimization, unless you have some data.

> and a large chunk
> of it is a "poor man's" reflection system.

Do you mean the i/o categories? This follows the example of the standard library
and the boost iterator library. It's better than reflection, since you can't get
accidental conformance.

> 1a. An issue that appeared during the previous review (my More-I/O library)
> was the necessity of the stream base classes that could wrap custom
> stream-buffer classes. The complaint was that the wrappers couldn't be a
> 100% solution because they can't forward constructors, so a final derived
> class has to be made the manually carries over the constructors. Well, that
> is true, mainly because C++ generally doesn't provide any member forwarding
> (besides in a limited way with inheritance). The sample stream-buffer in
> More-I/O generally had added-value member functions attached, that perform
> inspection or (limited) reconfiguration. Those member functions also have
> to be manually carried over to the final derived stream class. The point
> (after all this exposition) is that the More-I/O framework at least
> acknowledges support for value-added member functions. The Iostreams
> framework seems to totally ignore the issue! (So we could have a 90%
> solution for a quarter of the work, but triple the #included code length!)

With a streambuf_facade or stream_facade you can access the underlying resource
directly using operators * and ->. E.g.,

    stream_facade<tcp_resource> tcp("www.microsoft.com", 80);
    ...
    if (tcp->input_closed()) {
       ...
    }

Maybe I should stress this more in the documentation. (I imagine some people
won't like the use of operators * and -> here, but these can be replaced by a
member functions such as resource().)

> 2. Another issue that appeared during the More-I/O review was that plug-ins
> for Iostreams could be used for other ultimate sources/sinks unrelated to
> standard streams. What would those be? If a system in question supports
> Standard C++, it probably uses the Standard I/O framework for (text) I/O.
> If the system doesn't/can't use Standard I/O, it has to use a custom I/O
> framework. To use it with Iostreams framework plug-ins, an adapter would
> have to be made for the custom I/O routines. In that case, an adapter could
> be made for the Standard I/O framework instead. (As I said in the prelude,
> I'm skipping over binary-I/O concerns.)

I don't follow this.

> To sum it up:
>
> I would REJECT the library (at least for now). Maybe chain-filtering
> stream-buffer and stream base classes could be made instead?
>

Sorry to hear that -- I hope I can change your mind.

Jonathan

Next message: Richard Hadsell: "[boost] Building Boost with PathScale compiler"
Previous message: Alex Chovanec: "[boost] Re: Review schedule"
In reply to: Daryle Walker: "Re: [boost] IOStreams formal review start"
Next in thread: Jonathan Turkanis: "[boost] Re: IOStreams formal review start"
Reply: Jonathan Turkanis: "[boost] Re: IOStreams formal review start"
Reply: Daryle Walker: "Re: [boost] IOStreams formal review start"

Date view	Thread view	Subject view	Author view

Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk