Subject: Re: [boost] [iostream] Device::read return inconsistency
From: Richard Smith (richard_at_[hidden])
Date: 2009-05-10 11:51:51


eg wrote:

> Richard Smith wrote:
>>
>> The example in the documentation for the Source concept in Boost.Iostreams
>> says:
>>
>> std::streamsize read(char* s, std::streamsize n)
>> {
>> // Read up to n characters from the input
>> // sequence into the buffer s, returning
>> // the number of characters read, or -1
>> // to indicate end-of-sequence.
>> }
>>
>> (The same text appears in 'valid expressions / semantics' table, which is
>> probably more definitive.)
>>
>> What does a return of less than n actually mean?
>
> I think the tutorial example for the multi-char shell comments filter gives
> an example of one such case (c == WOULD_BLOCK).

OK. So the implementation of iostreams::get(Source) assumes
that a return of 0 from Source::read() means that input
would block.

However indirect_streambuf::underflow calls Source::read()
and assumes that a return of 0 means EOF. And I've just
done a brief test to verify this. If 0 is treated as
WOULD_BLOCK, then the following code should spin;
conversely, if 0 is treated as EOF, then it should exit
immediately.

   #include <cstdio>    // EOF
   #include <iostream>
   #include <boost/iostreams/categories.hpp>
   #include <boost/iostreams/stream.hpp>

   namespace io = boost::iostreams;

   struct my_source {
       typedef char char_type;
       typedef io::source_tag category;

       std::streamsize read(char*, std::streamsize)
           { return 0; }
   };

   int main()
   {
       io::stream<my_source> in(( my_source() ));
       int c;
       while ( ( c = in.get() ) != EOF )
         std::cout.put( char(c) );
   }

Running it with boost 1.39, it exits immediately, and adding
some diagnostics makes it clear that read() is called once.
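
That result is what one would expect from any stream buffer whose
underflow() maps "no characters delivered" to EOF. The following is a
minimal sketch of that interpretation using only the standard library.
The names zero_source, zero_is_eof_buf and count_chars are hypothetical,
and this is not the code of indirect_streambuf, only an illustration of
the behaviour observed in the test.

```cpp
#include <cassert>
#include <ios>
#include <istream>
#include <streambuf>
#include <string>

// Stand-in for the my_source test device: read() always returns 0.
struct zero_source {
    int calls = 0;  // how many times read() has been invoked
    std::streamsize read(char*, std::streamsize) { ++calls; return 0; }
};

// Hypothetical, simplified stream buffer (NOT Boost's indirect_streambuf).
// Its underflow() treats "no characters delivered" as end-of-file.
class zero_is_eof_buf : public std::streambuf {
public:
    explicit zero_is_eof_buf(zero_source& src) : src_(src) {}

protected:
    int_type underflow() override {
        std::streamsize n = src_.read(buf_, sizeof buf_);
        // Assumption: read() returns -1 or a count in [0, buffer size].
        assert(n >= -1 && n <= static_cast<std::streamsize>(sizeof buf_));
        if (n <= 0)  // 0 and -1 are indistinguishable from here on
            return traits_type::eof();
        setg(buf_, buf_, buf_ + n);
        return traits_type::to_int_type(*gptr());
    }

private:
    zero_source& src_;
    char buf_[64];
};

// Runs the same loop as the test program and returns the number of
// characters obtained before EOF was reported.
inline int count_chars(zero_source& src) {
    zero_is_eof_buf buf(src);
    std::istream in(&buf);
    int n = 0;
    while (in.get() != std::char_traits<char>::eof())
        ++n;
    return n;
}
```

With this buffer, count_chars() returns 0 and src.calls ends up as 1:
the first read() returning 0 is reported as EOF and the stream never
asks again.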

The iostreams::get(Source) code eventually calls the
following code from lines 174-180 of iostreams/read.hpp:

   char_type c;
   std::streamsize amt;
   return (amt = t.read(&c, 1)) == 1 ?
       traits_type::to_int_type(c) :
       amt == -1 ?
           traits_type::eof() :
           traits_type::would_block();

Clearly read() returning 0 would result in WOULD_BLOCK, but
so too would any negative return other than -1. On most
systems WOULD_BLOCK == -2 (though this is dependent on your
C library): is the intention that read() should return
WOULD_BLOCK if it blocks?
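
To make the mapping concrete, here is a small stand-alone sketch that
mirrors the shape of the quoted ternary for a one-character read. The
names classify and would_block_value are hypothetical, and the value
EOF - 1 is an assumption chosen to match the "-2 on most systems"
observation above; the sketch does not use the Boost headers.

```cpp
#include <cassert>
#include <cstdio>  // EOF
#include <ios>     // std::streamsize

// Assumed stand-in for WOULD_BLOCK: one less than EOF.
constexpr int would_block_value = EOF - 1;

// Mirrors the quoted ternary. `amt` is the value returned by
// read(&c, 1); `c` is the character that read() stored, if any.
constexpr int classify(std::streamsize amt, char c) {
    return amt == 1  ? static_cast<unsigned char>(c)
         : amt == -1 ? EOF
                     : would_block_value;
}

// Every return other than 1 and -1 lands in the would-block branch.
static_assert(classify(0, 0) == would_block_value, "0 maps to would-block");
static_assert(classify(-2, 0) == would_block_value, "-2 maps to would-block");
static_assert(classify(-7, 0) == would_block_value, "-7 maps to would-block");
static_assert(classify(-1, 0) == EOF, "-1 maps to EOF");
```

Under these assumptions a source cannot signal anything through the
particular negative value it returns: -2, -7 and 0 are all reported
identically.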

However, if that's the case, a lot of other code breaks.
To take just one example, filter/aggregate.hpp, lines 120-122,
implicitly assumes that -1 is the only negative value that
can be returned. And it is far from alone.
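
The hazard in that kind of code can be sketched as follows. This is a
hypothetical read-until-EOF loop, not the code from the Boost header;
drain, text_then_eof and always_minus_two are made-up names. It shows
why a loop that only tests for -1 is in trouble if read() returns some
other negative value, and what an explicit guard would look like.

```cpp
#include <algorithm>
#include <cassert>
#include <ios>
#include <stdexcept>
#include <string>

// Hypothetical helper: appends everything a source delivers to `out`,
// stopping at -1 (end-of-sequence).
template <typename Source>
void drain(Source& src, std::string& out) {
    char buf[16];
    for (;;) {
        std::streamsize amt = src.read(buf, sizeof buf);
        if (amt == -1)
            break;
        // Without this check, a return such as -2 would make `buf + amt`
        // point before the buffer, and the append below would be undefined.
        if (amt < 0)
            throw std::runtime_error("unexpected negative return from read()");
        out.append(buf, buf + amt);
    }
}

// Example source: delivers its text once, then reports end-of-sequence.
struct text_then_eof {
    explicit text_then_eof(const std::string& t) : text(t), done(false) {}
    std::streamsize read(char* s, std::streamsize n) {
        if (done) return -1;
        done = true;
        std::streamsize len = std::min<std::streamsize>(n, text.size());
        text.copy(s, len);
        return len;
    }
    std::string text;
    bool done;
};

// Example source: returns a negative value other than -1.
struct always_minus_two {
    std::streamsize read(char*, std::streamsize) { return -2; }
};
```

Draining text_then_eof("abc") yields "abc", while always_minus_two
trips the guard. Note that a source which keeps returning 0 would make
this particular loop spin, which is the other half of the ambiguity.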

The only other possibility is that the Source concept
leaves a return of 0 (and most probably of anything less
than n) undefined, and that various parts of the library
rely on different interpretations to get their documented
behaviour. And if that was the intention, I sincerely hope
the library would have failed review.

I have several patches that fix this in different ways,
depending on whichever behaviour is desired, but in the absence
of documentation specifying the intended behaviour of a
return in the [0, n) range, I'm not sure which I should be
testing and submitting.

Richard

