Boost logo

Boost :

From: Rob Stewart (stewart_at_[hidden])
Date: 2005-03-10 10:47:40


From: "Jonathan Turkanis" <technews_at_[hidden]>

I just noticed that I didn't reply to this.

> Rob Stewart wrote:
> > From: "Jonathan Turkanis" <technews_at_[hidden]>
>
> >> If the library is well-received, however, I expect eventually there
> >> will be a large body of user-defined filters, and I wouldn't want
> >> them to have to be rewritten.
> >
> > Given the filters written with sync I/O in mind are quite
> > different from those written for async I/O, is that really an
> > issue?
>
> Do you mean that different filtering operations will be desired, or that the
> implementation will be different?

I meant that the implementation would differ.

> My hope was to make the implementation of non-blocking filters easy enough that
> I could require all filters to be non-blocking. Maybe my original message was
> unclear on this point.

OK.

> >> (This is another topic that didn't get much attention during review.
> >> There are some, notably James Kanze, who insist that well-designed
> >> stream buffers should not throw exceptions. My defense of exceptions
> >> is here: http://tinyurl.com/59xv6. At the time of the review, I was
> >> prepared to switch to error codes, but it's a bit late now.)
> >
> > I apparently missed that during the review. On what basis does
> > James Kanze make his assertion?
>
> See this message by James Kanze, my reply, his reply and my further reply:
> http://tinyurl.com/4tkj3. I don't find his argument very convincing on this
> point. But he is very knowledgable, so I'm reluctant to dismiss his view.

In the end, I think James conceded that exceptions for unusual
things were needed. It's the common events that should be
handled by exceptions:

> > So long as EOF and EAGAIN do not result in exceptions,
>
> right
>
> > given how commonly they occur, I have no
> > problems with exceptions. However, it sounds like you might be
> > throwing an exception to indicate EOF. If so, I don't like it.
>
> How did I give you that impression?

You're not, so it isn't important.

> > It is not entirely uncommon to read to EOF, do something to a
> > file, and continue reading. For example, we have certain files
> > that we tail throughout the day in our production apps. EOF is a
> > common occurence because we quickly process the new data appended
> > to the files and reach EOF again. That is, the sink is faster
> > than the source. This behavior would be complicated by EOF
> > throwing an exception. An error indication of EOF, plus the
> > ability to clear that condition and try again, fits this usage
> > better. (Granted, that use of a file is not widespread, but
> > since you offer no choice, such usage is not possible.)
> > You mentioned in the cited documentation section that you didn't
> > want to complicate things such that users would wonder which of
> > the various ways a given function indicates errors. I
> > sympathize, but isn't that what you're introducing with your
> > basic_character idea?
>
> I'm not treating EOF or EAGAIN as errors. What I'm trying to avoid is having a
> single error (or any other type of condition, for that matter) represented in
> more than one way.

Good.

> > How about putting a state variable in filters, sources, and sinks
> > to indicate EAGAIN, EOF, and GOOD. Then, functions that only
> > read or write a single character can return a bool, and functions
> > that read or write multiple characters can return a count. If
> > the bool is false or the count is zero, the code can query the
> > state to determine whether EAGAIN or EOF occurred. That keeps
> > the error indication in the return value simple and even leaves
> > room for adding additional error states should that prove
> > necessary. Hard errors can still be signaled via exception.
>
> This was one of the options I seriously considered. I believe the very first
> version of the library I posted had this feature. I tend to think it would make
> life harder for people writing filters and devices to force them to implement an
> additional function. As things stand, you can often get by with a implementing a
> single function, and it works pretty well. Also, it makes it harder to fit
> standard streams and stream bufferss into the framework: stream buffers don't
> have a state indicator at all; streams do, but it doesn't map well to
> good/eof/eagain.

Excellent points. I think your current direction is appropriate
in light of these things.

> >>>That would help code deal with
> >>> synchronous code in the asynchronous style.)
> >>
> >> I guess I could implement
> >>
> >> bool eof(int n) { return n == EOF; }
> >> bool good(int n) { return n != EOF; }
> >>
> >> bool weof(std::char_traits<wchar_t>::int_type n)
> >> { return n == WEOF; }
> >> bool wgood(std::char_traits<wchar_t>::int_type n)
> >> { return n != WEOF; }
> >>
> >> I tend to think that n == EOF and n == WEOF are more readable,
> >> however.
> >
> > Yes, they are more readable, but if you don't document the
> > implementation of those functions, library users will expect to
> > use them in all situations. Thus, switching to async from sync
> > I/O will be less of a stylistic change. (That also leaves room
> > for you to change the implementation details if need be.)
>
> Let me make sure we're on the same page: We're assuming I decide to provide
> separate concepts for non-blocking i/o, and that some of the non-blocking
> concepts use basic_character. Are you saying that someone who is used to writing
> non-blocking filters and tries to write a blocking filter may get confused
> because eof() doesn't work with int?
>
> Maybe I could have get() return a basic_character even in the blocking case, and
> explain that there's no need to test for eagain in blocking i/o. This would
> eliminate one of the last traces of char_traits in the public interface of the
> library, which would be good. The other reasonable way to eliminate char_traits
> would be to have blocking get() return an optional<char> (see my reply to
> Thorsten); but if I'm using basic_character for non-blocking i/o, it may make
> sense to use it in both places.

I meant that if the library user only knows to write "eof(c),"
and doesn't know that "c == EOF" is an alternative spelling
because you don't document it, then the user will only ever write
"eof(c)" whether they are writing a blocking or non-blocking
filter.

Whether blocking filters traffic char_traits, while non-blocking
uses basic_character is a separate question, but I think the
answer should be obvious: use basic_character in both places.
Then, library users can remain oblivious to char_traits if they
don't already know about it.

> > Of course, if you adopt my state idea, then the return value
> > would simply indicate that things went well or something less
> > than ideal happened. In the latter case, one would query the
> > state of the filter/device to learn what happened. That would
> > obviate your basic_character class and these functions.
>
> > One problem with code that returns status information is that
> > folks can forget to check the status.
>
> This is a real problem with eagain, since testing may not reveal the error
> unless eagain happens to occur at the right place in a character sequence.

You'll need special test sources and sinks that can be configured
to produce "would_block" after N calls.

> > You could return a special
> > type in lieu of bool, and basic_character in lieu of char_type,
> > which require inspecting whether they indicate an error
> > condition. If that query is not done, the destructor can
> > complain.
>
> Could you elaborate? It sounds like it could lead to very poor performance.

I just mean that querying the status sets a flag in the object
that the dtor checks. If the flag wasn't set, then the dtor
complains. The complaint code could be an assertion or maybe a
message printed on stderr. You can even conditionally compile
away the checks.

-- 
Rob Stewart                           stewart_at_[hidden]
Software Engineer                     http://www.sig.com
Susquehanna International Group, LLP  using std::disclaimer;

Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk