From: Jonathan Turkanis (technews_at_[hidden])
Date: 2005-03-04 14:09:26


Rob Stewart wrote:
> From: "Jonathan Turkanis" <technews_at_[hidden]>

>> expect I will have to introduce some new Device concepts. However, I
>> would like to modify the *current* filter concepts so that they will
>> work unchanged when non-blocking and asynchronous devices are
>> introduced.
>
> There's a difference between making the concepts work unchanged
> and making the components of the library work unchanged.

True. I can always fix the library-provided components so that they do the right
thing, even if I have to rely on magic, i.e., on an incestuous relationship with
library internals.

If the library is well-received, however, I expect eventually there will be a
large body of user-defined filters, and I wouldn't want them to have to be
rewritten.

>> II. The Solution (the easy part)
>> ---------------------------
>>
>> I believe it will suffice to:
>>
>> - provide the functions put() and write() (both filter member
>> functions and the free functions with the same names) with a way to
>> indicate that fewer than the requested number of characters have
>> been written to the underlying data sink even though no error has
>> occurred.
>>
>> - Provide the functions get() and read() (both filter member
>> functions and the free functions with the same names) with a way to
>> indicate that fewer than the requested number of characters have
>> been read from the underlying data source, even though no error has
>> occurred and EOF has not been reached.
>
> Reasonable notions.
>
>> This is easily achieved for put() and write(), and almost as easily
>> for read():
>>
>> - Instead of returning void, put() can return a bool indicating
>> whether the given character was successfully written.
>
> Clean enough.

Okay.

>> - Instead of returning void, write() can return an integer
>> indicating the number of characters written.
>
> But it also needs to indicate errors.

Errors are indicated with exceptions.

(This is another topic that didn't get much attention during review. There are
some, notably James Kanze, who insist that well-designed stream buffers should
not throw exceptions. My defense of exceptions is here:
http://tinyurl.com/59xv6. At the time of the review, I was prepared to switch to
error codes, but it's a bit late now.)
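
To make the put()/write() convention above concrete, here is a minimal sketch. It is not library code: the namespace sketch, the bounded_sink type and its members are hypothetical stand-ins for a device that can temporarily refuse data. Only the return-value convention (bool from put, a count from write, exceptions for real errors) comes from the discussion.

```cpp
#include <cassert>
#include <cstddef>
#include <ios>
#include <string>

namespace sketch {

// Hypothetical sink with limited room, standing in for a non-blocking device.
struct bounded_sink {
    std::string data;      // characters accepted so far
    std::size_t capacity;  // room available before the sink refuses data
    bool        broken;    // simulates a genuine i/o failure
};

// Returns false if the character cannot be written right now.
// A genuine error is reported with an exception, not the return value.
inline bool put(bounded_sink& snk, char c)
{
    if (snk.broken)
        throw std::ios_base::failure("sink failure");
    if (snk.data.size() >= snk.capacity)
        return false;  // no error: the caller may retry later
    snk.data.push_back(c);
    return true;
}

// Returns the number of characters actually written, possibly fewer than n.
inline std::streamsize write(bounded_sink& snk, const char* s, std::streamsize n)
{
    assert(n >= 0);
    std::streamsize written = 0;
    while (written < n && put(snk, s[written]))
        ++written;
    return written;
}

} // namespace sketch
```

A caller that receives a short count would keep the unwritten tail and offer it again on the next call.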

>> - Currently, when read returns fewer characters than the requested
>> amount it is treated as an EOF indication. Instead, we can allow
>> read to return the actual number of characters read, and reserve -1
>> to indicate EOF, since it is not needed as an error indication.
>
> Clean enough.

Good, thanks.
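
The read() convention can be sketched the same way. The chunked_source type and its members below are hypothetical and exist only to simulate a source whose data arrives in pieces. What matters is the return value: a count (possibly zero) when data may still arrive, and -1 only at the true end of the stream.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <ios>
#include <string>

namespace sketch {

// Hypothetical source that releases its data in pieces.
struct chunked_source {
    std::string data;       // everything the source will ever produce
    std::size_t pos;        // index of the next unread character
    std::size_t available;  // characters readable without blocking
};

// Returns the number of characters read (possibly 0 if none are ready yet),
// or -1 once the end of the stream has been reached.
inline std::streamsize read(chunked_source& src, char* s, std::streamsize n)
{
    assert(n >= 0);
    if (src.pos == src.data.size())
        return -1;  // EOF: nothing more will ever arrive
    std::size_t ready = std::min(src.available, src.data.size() - src.pos);
    std::size_t amt   = std::min(ready, static_cast<std::size_t>(n));
    src.data.copy(s, amt, src.pos);  // copies amt characters starting at pos
    src.pos       += amt;
    src.available -= amt;
    return static_cast<std::streamsize>(amt);
}

} // namespace sketch
```

With this convention a short read no longer implies EOF, so existing callers that treat "fewer than requested" as the end of input would need to test for -1 instead.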

>> III. The Solution (the ugly part)
>> ---------------------------

>> The return type already serves a dual purpose: it can store a
>> character or an EOF indication. Unfortunately, with non-blocking or
>> async i/o there are now three possible results of a call to get:
>>
>> 1. A character is successfully retrieved.
>> 2. The end of the stream has been reached.
>> 3. No characters are currently available, but more may be available
>> later.
>
> Right.
>
>> My preferred solution is to have get() return an instance of a
>> specialization of a class template basic_character which can hold a
>> character, an EOF indication or a temporary failure indication:

<snip synopsis of basic_character>

> OK.

Great, thanks. This is the part I was worried about most.
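
Since the synopsis is snipped here, the following is NOT the proposed basic_character. It is a hypothetical stand-in, named basic_character_sketch, that only illustrates the three-state idea. The member names good(), eof() and fail() are taken from the discussion. The factory functions make_eof() and make_fail() and the accessor value() are assumptions made for this example.

```cpp
#include <cassert>

namespace sketch {

// Hypothetical three-state result for get(): a character, EOF, or
// "nothing available right now, try again later".
template<typename Ch>
class basic_character_sketch {
    enum state { is_good, is_eof, is_fail };
public:
    // Implicit on purpose, so that a filter can simply "return c;".
    basic_character_sketch(Ch c) : value_(c), state_(is_good) { }

    // Assumed factory names; the real interface may differ.
    static basic_character_sketch make_eof()
    { return basic_character_sketch(is_eof); }
    static basic_character_sketch make_fail()
    { return basic_character_sketch(is_fail); }

    bool good() const { return state_ == is_good; }  // holds a character
    bool eof()  const { return state_ == is_eof; }   // end of stream
    bool fail() const { return state_ == is_fail; }  // temporarily unavailable

    // Only meaningful when good() is true.
    Ch value() const { assert(good()); return value_; }

private:
    explicit basic_character_sketch(state s) : value_(), state_(s) { }

    Ch    value_;
    state state_;
};

} // namespace sketch
```

A filter's get() would then test the result of the upstream get() with eof() and fail() and pass either state along unchanged before touching the character.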

>> V. Problems
>> ----------------------------
>>
>> 1. Harder to learn. Currently the function get and the concept
>> InputFilter are very easy to explain. I'm afraid having to
>> understand the basic_character template before learning these
>> functions will discourage people from using the library.
>
> The class template is hardly complicated. I can't imagine it
> would be a show stopper, though it does add some complexity.

I'm thinking of describing it as a replacement for traits_type::int_type.
Compared to using int_type and its family of helper functions (eq_int_type,
eof, not_eof, ...) basic_character is a snap. ;-)

>> 2. Harder to use. Having to check for eof and fail makes writing
>> simple filters, like the above, slightly harder. I'm worried that
>> the effect on more complex filters may be even worse. This applies
>> not just to get, but to the other functions as well, since their
>> return values will require more careful examination.
>
> That's a real issue.

It's become even more clear in this thread, given my botched uncommenting_filter
implementation.

>> 3. Performance. It's possible that the change will have a negative
>> effect on performance. I was planning to implement it and then
>> perform careful measurements, but I have run out of time for this. I
>> think the effect will be slight.
>
> I'd expect the impact to be small, but quantifying it would be helpful.

Agreed.

>> VI. Benefits
>> ------------------------
>>
>> A positive side-effect of this change would be that I can rename the
>> filter concepts
>>
>> InputFilter --> PullFilter
>> OutputFilter --> PushFilter
>>
>> and allow both types of filter to be added either to input or to
>> output streams. Filter writers could then choose the filter concept
>> which best expressed the filtering algorithm without worrying
>> whether it will be used for input or output.
>
> Nice.

I'm thinking if I adopt alternative 2, below, I can still do this. Adding a
blocking PullFilter to an output stream or a blocking PushFilter to an input
stream filter would cause extra copying, which I can warn about in the docs.
People who need the highest performance can always write non-blocking filters.

>> VII. Alternatives.
>>
>> 1. Adopt the convention that read() always blocks until at least one
>> character is available, and that get() always blocks. This would
>> give up much of the advantage of non-blocking and async i/o.
>
> Definitely not a good idea.

Agreed.

>> 2. Add new non-blocking filter concepts, but hide them in the
>> "advanced" section of the library. All the library-provided filters
>> would be non-blocking, and users would be encouraged, but not
>> required, to write non-blocking filters.
>
> I like this better, but I wonder if there is a unified approach
> within the library that still keeps things tidy for those writing
> synchronous code. Could the library components recognize whether
> a filter was written using char/wchar_t versus basic_character
> and deal synchronously or asynchronously as a result?

It would work by introducing a new category tag non_blocking_tag, and treating
user-defined components differently depending on whether their io_category is
convertible to non_blocking_tag.
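
A rough sketch of that kind of tag dispatch follows. Only the names non_blocking_tag and io_category come from the description above. The base tag filter_tag, the two example filters, the trait is_non_blocking and the function strategy_for are hypothetical, and std::is_convertible is used here simply as a convenient standard trait.

```cpp
#include <string>
#include <type_traits>

namespace sketch {

struct filter_tag { };                     // hypothetical base category
struct non_blocking_tag : filter_tag { };  // marks non-blocking components

// Two hypothetical user-defined filters advertising their category.
struct blocking_filter     { typedef filter_tag       io_category; };
struct non_blocking_filter { typedef non_blocking_tag io_category; };

// True when T's io_category is convertible to non_blocking_tag.
template<typename T>
struct is_non_blocking
    : std::is_convertible<typename T::io_category, non_blocking_tag>
{ };

// Overloads selected at compile time by the result of the trait.
inline std::string describe(std::true_type)  { return "non-blocking"; }
inline std::string describe(std::false_type) { return "blocking"; }

// The library could pick a different code path per filter in this way.
template<typename Filter>
std::string strategy_for(const Filter&)
{
    return describe(typename is_non_blocking<Filter>::type());
}

} // namespace sketch
```

Because the choice is made on the category alone, a filter written in the existing blocking style needs no changes to keep working.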

> That way, those writing synchronous code can write it using a
> character type of char/wchar_t and everything to/from that code
> will be in the simplified, synchronous style you currently
> offer. However, if the client takes advantage of the advanced,
> asynchronous capabilities of the library by using
> basic_character, then the style changes based upon needing to
> deal with the EAGAIN condition.

> To keep the two styles as similar as possible, you might need to
> alter the current interfaces slightly, but probably not much
> (pure, abstract speculation; I haven't even tried to validate
> that assertion).

If I adopt this solution, I'll be writing non-blocking versions of all the
library filters and so may be able to judge whether there is a big stylistic
difference, and whether it would be a good idea or even possible to modify the
current concepts.

My inclination is to say that no modification should be made if I support
non-blocking components with a new category tag.

> (The member functions good(), eof(), and fail() on
> basic_character could be made non-member functions and could be
> implemented for char and wchar_t. That would help code deal with
> synchronous code in the asynchronous style.)

I guess I could implement

     #include <cstdio>   // EOF
     #include <cwchar>   // WEOF
     #include <string>   // std::char_traits

     bool eof(int n) { return n == EOF; }
     bool good(int n) { return n != EOF; }

     bool weof(std::char_traits<wchar_t>::int_type n)
     { return n == WEOF; }
     bool wgood(std::char_traits<wchar_t>::int_type n)
     { return n != WEOF; }

I tend to think that n == EOF and n == WEOF are more readable, however.

Thanks for your comments!

Jonathan

