From: Rob Stewart (stewart_at_[hidden])
Date: 2005-03-03 23:33:43

From: "Jonathan Turkanis" <technews_at_[hidden]>
> I. The Problem ---------------------------
> Standard iostreams do not work well with non-blocking or asynchronous i/o. I
> would eventually like to extend the library to provide support for non-blocking
> and async i/o, and when I do so I expect I will have to introduce some new
> Device concepts. However, I would like to modify the *current* filter concepts
> so that they will work unchanged when non-blocking and asynchronous devices are
> introduced.

There's a difference between making the concepts work unchanged
and making the components of the library work unchanged.

> II. The Solution (the easy part) ---------------------------
> I believe it will suffice to:
> - Provide the functions put() and write() (both filter member functions and the
> free functions with the same names) with a way to indicate that fewer than the
> requested number of characters have been written to the underlying data sink
> even though no error has occurred.
> - Provide the functions get() and read() (both filter member functions and the
> free functions with the same names) with a way to indicate that fewer than the
> requested number of characters have been read from the underlying data source,
> even though no error has occurred and EOF has not been reached.

Reasonable notions.

> This is easily achieved for put() and write(), and almost as easily for read():
> - Instead of returning void, put() can return a bool indicating whether the
> given character was successfully written.

Clean enough.

> - Instead of returning void, write() can return an integer indicating the number
> of characters written.

But it also needs to indicate errors.

> - Currently, when read returns fewer characters than the requested amount it is
> treated as an EOF indication. Instead, we can allow read to return the actual
> number of characters read, and reserve -1 to indicate EOF, since it is not
> needed as an error indication.

Clean enough.
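
To make the return conventions discussed above concrete, here is a minimal sketch. The types chunked_source and bounded_sink are hypothetical and are not part of the library; they only mimic devices that can accept or deliver part of a request. The sketch assumes a return of 0 from read() means "nothing available yet", and it does not model error reporting at all.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical in-memory source: hands out at most `chunk` characters per
// call, mimicking a device that has only part of its data ready.
struct chunked_source {
    std::string data;
    std::size_t pos = 0;
    std::size_t chunk = 4;
    bool closed = false;  // true once no more data will ever arrive

    // Returns the number of characters read. 0 means "nothing available
    // right now, try again later" (an assumption of this sketch); -1 means EOF.
    std::ptrdiff_t read(char* s, std::ptrdiff_t n) {
        assert(n >= 0);
        std::size_t avail = data.size() - pos;
        if (avail == 0) return closed ? -1 : 0;
        std::size_t amt = std::min({avail, chunk, static_cast<std::size_t>(n)});
        std::copy(data.begin() + pos, data.begin() + pos + amt, s);
        pos += amt;
        return static_cast<std::ptrdiff_t>(amt);
    }
};

// Hypothetical sink with a bounded buffer: accepts only what fits.
struct bounded_sink {
    std::string buffer;
    std::size_t capacity = 8;

    // Returns the number of characters accepted, possibly fewer than n.
    std::ptrdiff_t write(const char* s, std::ptrdiff_t n) {
        assert(n >= 0);
        std::size_t room = capacity - buffer.size();
        std::size_t amt = std::min(room, static_cast<std::size_t>(n));
        buffer.append(s, amt);
        return static_cast<std::ptrdiff_t>(amt);
    }

    // Returns true only if the character was actually accepted.
    bool put(char c) { return write(&c, 1) == 1; }
};
```

A caller that gets a short count from write() has to keep the unwritten tail and retry later, which is the extra bookkeeping the proposal pushes onto filter authors.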

> III. The Solution (the ugly part) ---------------------------
> The function get presents more of a challenge. Currently it looks like this (for
> char_type == char):
> struct my_input_filter : input_filter {
>     template<typename Source>
>     int get(Source& src);
> };
> The return type already serves a dual purpose: it can store a character or an
> EOF indication. Unfortunately, with non-blocking or async i/o there are now
> three possible results of a call to get:
> 1. A character is successfully retrieved.
> 2. The end of the stream has been reached.
> 3. No characters are currently available, but more may be available later.


> My preferred solution is to have get() return an instance of a specialization of
> a class template basic_character which can hold a character, an EOF indication
> or a temporary failure indication:
> template<typename Ch>
> class basic_character {
> public:
>     basic_character(Ch c);
>     operator Ch () const;
>     bool good() const;
>     bool eof() const;
>     bool fail() const;
> };
> typedef basic_character<char> character;
> typedef basic_character<wchar_t> wcharacter;
> character eof(); // returns an EOF indication
> character fail(); // returns a temporary failure indication.
> wcharacter weof();
> wcharacter wfail();
> [Omitted: templated versions of eof and fail]
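
For readers who want to see the three outcomes side by side, the following is one possible way the quoted interface could be filled in. Everything beyond the quoted declarations is an assumption: the state enum, the make_eof()/make_fail() factories, the sketch namespace, and the string_source type are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

namespace sketch {

template<typename Ch>
class basic_character {
public:
    basic_character(Ch c) : c_(c), state_(is_good) {}
    // Only meaningful when good(); this sketch asserts that.
    operator Ch() const { assert(good()); return c_; }
    bool good() const { return state_ == is_good; }
    bool eof()  const { return state_ == is_eof; }
    bool fail() const { return state_ == is_fail; }  // temporary failure

    // Hypothetical factories; the proposal only names the free functions.
    static basic_character make_eof()  { return basic_character(is_eof); }
    static basic_character make_fail() { return basic_character(is_fail); }

private:
    enum state { is_good, is_eof, is_fail };
    explicit basic_character(state s) : c_(), state_(s) {}
    Ch c_;
    state state_;
};

typedef basic_character<char>    character;
typedef basic_character<wchar_t> wcharacter;

inline character  eof()   { return character::make_eof(); }
inline character  fail()  { return character::make_fail(); }
inline wcharacter weof()  { return wcharacter::make_eof(); }
inline wcharacter wfail() { return wcharacter::make_fail(); }

// Hypothetical source showing the three possible results of get().
struct string_source {
    std::string data;
    std::size_t pos = 0;
    bool closed = false;  // true once no more data will ever arrive

    character get() {
        if (pos < data.size()) return data[pos++];  // 1. a character
        return closed ? eof()                       // 2. end of stream
                      : fail();                     // 3. nothing yet, retry later
    }
};

}  // namespace sketch
```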


> IV. Examples (feel free to skip) ---------------------------

[snipped async-enabled examples grow from synchronous versions]

> V. Problems ----------------------------
> 1. Harder to learn. Currently the function get and the concept InputFilter are
> very easy to explain. I'm afraid having to understand the basic_character
> template before learning these functions will discourage people from using the
> library.

The class template is hardly complicated. I can't imagine it
would be a show stopper, though it does add some complexity.

> 2. Harder to use. Having to check for eof and fail makes writing simple filters,
> like the above, slightly harder. I'm worried that the effect on more complex
> filters may be even worse. This applies not just to get, but to the other
> functions as well, since their return values will require more careful
> examination.

That's a real issue.

> 3. Performance. It's possible that the change will have a negative effect on
> performance. I was planning to implement it and then perform careful
> measurements, but I have run out of time for this. I think the effect will be
> slight.

I'd expect the impact to be small, but quantifying it would be

> VI. Benefits ------------------------
> A positive side-effect of this change would be that I can rename the filter
> concepts
> InputFilter --> PullFilter
> OutputFilter --> PushFilter
> and allow both types of filter to be added either to input or to output streams.
> Filter writers could then choose the filter concept which best expressed the
> filtering algorithm without worrying whether it will be used for input or
> output.


> VII. Alternatives.
> 1. Adopt the convention that read() always blocks until at least one character
> is available, and that get() always blocks. This would give up much of the
> advantage of non-blocking and async i/o.

Definitely not a good idea.

> 2. Add new non-blocking filter concepts, but hide them in the "advanced" section
> of the library. All the library-provided filters would be non-blocking, and
> users would be encouraged, but not required, to write non-blocking filters.

I like this better, but I wonder if there is a unified approach
within the library that still keeps things tidy for those writing
synchronous code. Could the library components recognize whether
a filter was written using char/wchar_t versus basic_character
and deal synchronously or asynchronously as a result?

That way, those writing synchronous code can write it using a
character type of char/wchar_t and everything to/from that code
will be in the simplified, synchronous style you currently
offer. However, if the client takes advantage of the advanced,
asynchronous capabilities of the library by using
basic_character, then the style changes based upon needing to
deal with the EAGAIN condition.
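
As a rough illustration of how such recognition might work, here is a sketch of a compile-time check on the return type of get(). It uses C++11 facilities (decltype, <type_traits>) that postdate this message, and every name in it (is_basic_character, is_async_filter, dummy_source, the two filters, the stand-in basic_character) is hypothetical rather than anything the library provides.

```cpp
#include <type_traits>
#include <utility>

namespace sketch {

// Minimal stand-in for the proposed basic_character (layout is an assumption).
template<typename Ch>
struct basic_character { Ch value; int state; };

template<typename T>
struct is_basic_character : std::false_type {};
template<typename Ch>
struct is_basic_character<basic_character<Ch> > : std::true_type {};

// Hypothetical trait: a filter counts as asynchronous-style when its get(),
// called with a Source, returns a basic_character specialization.
template<typename Filter, typename Source>
struct is_async_filter
    : is_basic_character<
          typename std::decay<
              decltype(std::declval<Filter&>().get(std::declval<Source&>()))
          >::type> {};

struct dummy_source {};

// Synchronous style: get() returns int (a character or EOF).
struct sync_filter {
    template<typename Source>
    int get(Source&) { return 'x'; }
};

// Asynchronous style: get() returns basic_character.
struct async_filter {
    template<typename Source>
    basic_character<char> get(Source&) { return basic_character<char>{'x', 0}; }
};

}  // namespace sketch
```

A library component could then branch on is_async_filter<Filter, Source>::value to choose between a blocking loop and one that handles the "try again later" case.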

To keep the two styles as similar as possible, you might need to
alter the current interfaces slightly, but probably not much
(pure, abstract speculation; I haven't even tried to validate
that assertion).

(The member functions good(), eof(), and fail() on
basic_character could be made non-member functions and could be
implemented for char and wchar_t. That would help code deal with
synchronous code in the asynchronous style.)
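
A sketch of that parenthetical idea follows, assuming a compact stand-in for basic_character with a public state enum (my simplification, not the proposed interface). The helper append_if_good and the sketch namespace are hypothetical too. For plain char and wchar_t the non-member functions simply report "always good".

```cpp
#include <cassert>
#include <string>

namespace sketch {

// Compact stand-in for the proposed class template (details are assumptions).
template<typename Ch>
class basic_character {
public:
    enum state { is_good, is_eof, is_fail };
    basic_character(Ch c) : c_(c), state_(is_good) {}
    basic_character(state s) : c_(), state_(s) {}
    operator Ch() const { assert(good()); return c_; }
    bool good() const { return state_ == is_good; }
    bool eof()  const { return state_ == is_eof; }
    bool fail() const { return state_ == is_fail; }
private:
    Ch c_;
    state state_;
};

// Non-member versions for basic_character forward to the members.
template<typename Ch> bool good(const basic_character<Ch>& c) { return c.good(); }
template<typename Ch> bool eof(const basic_character<Ch>& c)  { return c.eof(); }
template<typename Ch> bool fail(const basic_character<Ch>& c) { return c.fail(); }

// Plain characters are always "good": they cannot carry EOF or failure.
inline bool good(char)    { return true; }
inline bool eof(char)     { return false; }
inline bool fail(char)    { return false; }
inline bool good(wchar_t) { return true; }
inline bool eof(wchar_t)  { return false; }
inline bool fail(wchar_t) { return false; }

// Generic code written once in the asynchronous style; it accepts either a
// plain char or a basic_character<char>.
template<typename C>
bool append_if_good(const C& c, std::string& out) {
    if (!good(c)) return false;
    out.push_back(c);  // basic_character converts via operator Ch
    return true;
}

}  // namespace sketch
```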

Rob Stewart                           stewart_at_[hidden]
Software Engineer           
Susquehanna International Group, LLP  using std::disclaimer;
