From: Rob Stewart (stewart_at_[hidden])
Date: 2004-09-15 11:50:39


From: "Jonathan Turkanis" <technews_at_[hidden]>
> "Rob Stewart" <stewart_at_[hidden]> wrote in message:
> > From: "Jonathan Turkanis" <technews_at_[hidden]>:
> > > "Rob Stewart" <stewart_at_[hidden]> wrote in message:
>
> > If both remain, then each needs more
> > information and rationale so users understand which to choose for
> > a given use case.
>
> This should be the explanation:
>
> "If you have an existing streambuf implementation and you can't or don't
> want to reimplement it as a Resource, use streambuf_wrapping.hpp;
> otherwise, you should probably reimplement it as a Resource and use
> stream_facade."

I'm not sure Daryle would agree with that. Anyway, my point is
that if Boost accepts both libraries, then the two of you need to
determine the synergies and differences between your libraries
and ensure users understand the value of each approach.

> > > > "Peekable" does not imply being able to put back a character.
> > >
> > > I know. The natural choice would have been Putbackable, which sounds
> > > terrible. I chose Peekable because if you can put back a character then
> > > you can peek ahead in the stream by reading a character and then putting
> > > it back.
> > >
> > > > Wouldn't "Undoable" be a better name for this optional behavior?
> > >
> > > It sounds a bit too general. For instance, an Undoable Sink might allow
> > > characters already written to be cancelled.
> >
> > Many applications have only one level of undo and don't allow
> > everything to be undone. Consequently, I don't think this is
> > much of a problem. How about "revertable?"
>
> This still sounds too general. Maybe 'PutbackResource'?

That, of course, doesn't follow the "able" convention you've
established. Otherwise, it does get right to the point clearly.
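
The reasoning behind "Peekable" (read a character, then put it back) can be
sketched with plain standard iostreams. This is only an illustration of the
argument, not code from the library under review; the helper name
peek_via_putback is made up here.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: look at the next character using only get() and
// putback(). Returns the traits eof value if no character is available.
inline int peek_via_putback(std::istream& in)
{
    const int c = in.get();
    if (c == std::char_traits<char>::eof()) {
        return c;  // nothing was read, so there is nothing to put back
    }
    in.putback(static_cast<char>(c));  // restore the character just read
    return c;
}
```

After a successful call the character is still the next one returned by
get(), which is the sense in which put-back support implies peeking.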

> > > > There's no information provided on the performance of this
> > > > library. How much overhead does it add compared to custom
> > > > solutions?
> > >
> > > Performance comparable to hand-written components is listed as one of the
> > > main design goals. (Rationale-->Design Goals-->Item 3, at
> > > http://tinyurl.com/3z9ou). The footnote to that item
> > > (http://tinyurl.com/4th3v) mentions some performance comparisons I
> > > performed.
> >
> > The information, as I recall, wasn't rigorous, or at least not
> > reported that way.
>
> It was fairly rigorous, but not reported that way. I didn't finish the project
> and publish the results because
>
> - It was taking a long time :-)
> - I couldn't claim that my tests represented typical use cases
> - Initial results satisfied me that at most some fine-tuning would be required;
> no major overhaul was needed.

Perfectly reasonable.

> > > > The filter chain ordering for input versus output is awkward. It
> > > > should always be from the beginning to end; let the library
> > > > reverse the order.
> > >
> > > I'm surprised nobody mentioned this earlier.
> > >
> > > Having the resource at the end dramatically simplifies the interface. It
> > > allows an instance of filtering_stream or filtering_streambuf to be
> > > considered 'open' as soon as a resource is pushed, and 'closed' as soon
> > > as it is popped. This allows a simple stack-like interface. The
> > > alternative would be
> >
> > I understand.
> >
> > > (i) to add open/close functions (which are part of the interface of
> > > streambuf_facade, but would currently be redundant for filtering streams), OR
> >
> > I don't think this would be a problem, but I don't see -- and I'm
> > not going to go look right now, sorry -- why it is redundant for
> > filtering streams. Also, couldn't the function be called
> > "complete" or "add_resource?"
>
> It's redundant because adding a resource is currently the equivalent of 'open',
> and popping a resource is the equivalent of 'close'.

Gotcha.
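
The "push a resource = open, pop it = close" rule described above can be
sketched as a small stack. The class chain_sketch and its member names are
hypothetical stand-ins, not the library's filtering_stream; the sketch only
models the open/closed state and stores element names as strings.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical model of a filter chain whose state follows the resource.
class chain_sketch {
public:
    // Filters may only be added while no resource has been pushed.
    void push_filter(const std::string& name)
    {
        if (open_) throw std::logic_error("chain is already complete");
        elements_.push_back(name);
    }

    // Pushing the resource completes the chain: it is now 'open'.
    void push_resource(const std::string& name)
    {
        if (open_) throw std::logic_error("chain is already complete");
        elements_.push_back(name);
        open_ = true;
    }

    // When the chain is open the resource is on top, so popping closes it.
    void pop()
    {
        if (elements_.empty()) throw std::logic_error("chain is empty");
        elements_.pop_back();
        open_ = false;
    }

    bool is_open() const { return open_; }
    std::size_t size() const { return elements_.size(); }

private:
    std::vector<std::string> elements_;
    bool open_ = false;
};
```

With this shape no separate open/close functions are needed, which is the
redundancy being described.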

> > > In addition, allowing filters to be pushed after a resource would give many
> > > new users the impression that they can add filters *after* i/o is in
> > > progress. As has been discussed during the review, this is not currently
> > > supported; support can be added in limited circumstances, but not
> > > generally.
> > >
> > > Consider:
> > >
> > > filtering_ostream out;
> > > out.push(file_sink("log"));
> > > out.push(base_64_encoder());
> > > out << "hello world!\n"; // stream is implicitly 'open'
> > > out.push(zlib_compressor()); // error!
> >
> > This won't be a problem with complete() or add_resource().
>
> If you mean that the above should be rewritten
>
> filtering_ostream out;
> out.push(file_sink("log"));
> out.complete(base_64_encoder());
> out << "hello world!\n";
> out.push(zlib_compressor()); // error!
>
> you may be right that users would be less likely to make this mistake. I don't

Yes.

> see how add_resource would help at all.

Because "add_resource" was offered as a synonym for "complete."

> I believe the current stack-like interface is elegant and intuitive. Reversing
> the order will also be confusing if I adopt JC van Winkel's pipe notation, which
> I plan to do. If I adopt both changes, the following would be equivalent:
>
> filtering_ostream out;
> out.push(file_sink("log"));
> out.push(base_64_encoder());
> out.complete(newline_filter(newline::windows));
>
> ---
>
> filtering_ostream out(
> newline_filter(newline::windows) |
> base_64_encoder() |
> file_sink("log") );

The first example is using the proposed, new syntax, so I'd
prefer to see it written like this:

   filtering_ostream out;
   out.push(base_64_encoder());
   out.push(file_sink("log"));
   out.complete(newline_filter(newline::windows));

Then, the second, which is confusing as written, should be:

   filtering_ostream out(
      base_64_encoder() |
      file_sink("log") |
      newline_filter(newline::windows));

Then, the two are quite similar.

> > > > _______________________________
> > > > basic_newline_filter
> > > >
> > > > I realize that this is only an example, but there is no apparent
> > > > protection from misconfiguring the constructor flags. Grouping
> > > > the related options into enumerated types and taking separate
> > > > parameters for each group would make it safer to use. (You can
> > > > overload the bitwise OR operator to permit combining them
> > > > conveniently.) That is, since print_CR, print_LF, and print_CRLF
> > > > are mutually exclusive, they should be part of an enumerated type
> > > > that does not permit bitwise OR'ing. Since the posix, mac, and
> > > > windows values are meant to supplant all of the other values,
> > > > they should be part of their own enumerated type and should be
> > > > used in a separate constructor. The remaining options can be
> > > > part of a third enumerated type, with bitwise OR support, that
> > > > constitutes the second argument to the existing constructor.
> > >
> > > I was trying to keep things simple. Specifying more than one of print_CR,
> > > print_LF and print_CRLF yields a runtime error. Even if print_xxx had its
> > > own enumeration type, it would still be possible to specify illegal
> > > values.
> >
> > If no operators are defined for the enumerated type, then one
> > must go out of one's way to provide illegal values.
>
> Under your proposal, would a typical construction of a newline_filter look like
> this:
>
> newline_filter(write_CR, accept_LF | accept_CR | accept_CRLF )
>
> instead of
>
> newline_filter(write_CR | accept_LF | accept_CR | accept_CRLF )

Yes.
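
A rough sketch of that two-argument form follows. All names here
(write_mode, accept_flags, newline_filter_sketch) are invented for
illustration, and it uses scoped enums from later C++ standards for brevity.
The point is only that the mutually exclusive write option has no operator|,
while the accept options do.

```cpp
#include <cassert>

// Mutually exclusive choices: no operator| is defined, so combining two
// of them does not compile.
enum class write_mode { cr, lf, crlf };

// Combinable options, with an explicit operator|.
enum class accept_flags : unsigned {
    none = 0,
    lf   = 1u << 0,
    cr   = 1u << 1,
    crlf = 1u << 2
};

constexpr accept_flags operator|(accept_flags a, accept_flags b)
{
    return static_cast<accept_flags>(static_cast<unsigned>(a) |
                                     static_cast<unsigned>(b));
}

constexpr bool has_flag(accept_flags set, accept_flags f)
{
    return (static_cast<unsigned>(set) & static_cast<unsigned>(f)) != 0;
}

// Hypothetical filter that just records its configuration.
class newline_filter_sketch {
public:
    newline_filter_sketch(write_mode w, accept_flags a)
        : write_(w), accept_(a) {}

    write_mode writes() const { return write_; }
    bool accepts(accept_flags f) const { return has_flag(accept_, f); }

private:
    write_mode write_;
    accept_flags accept_;
};
```

An expression such as write_mode::cr | write_mode::lf is rejected by the
compiler, so the runtime check for conflicting print options is not needed
in this arrangement.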

> > > > _________________________________________________________________
> > > > boost::io::reverse
>
> > > > That is, why isn't a filter's interface based upon these
> > > > semantics:
> > > >
> > > > char_type filter(char_type ch);
> > > > streamsize filter(char const * input, streamsize n,
> > > > char const * output);
>
> > > The problem with making symmetric filters the default (or exclusive) filter
> > > concept is that it's unexpectedly difficult to implement them correctly.
>
> > However, I'd like more
> > information on why filters are difficult to write
> > bidirectionally. The I/F I mention above seems quite simple.
>
> Neither of your suggested interfaces is sufficient. The first one allows only
> character-for-character substitutions. The second, depending on the

I presume, then, that this would work:

   boost::optional<char_type>
   filter(char_type ch);

> interpretation of the return value, needs to be augmented to indicate how many
> characters of the input sequence or the output sequence were consumed. It's

Easily solved.

> sometimes necessary, e.g., to achieve a good compression ratio, to allow
> symmetric filters to output fewer characters than possible. In that case, one
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
I'd like to see that!

> needs a boolean parameter to instruct the filter to flush its buffers.

I don't quite understand your point, but that's immaterial. It
sounds like something like this would work:

   std::pair<streamsize, streamsize>
   filter(char const * input, streamsize n,
      char const * output);

Provided those interfaces are close, wouldn't this make writing
symmetric filters easier?
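
As a sketch of how a filter might look under the pair-returning signature:
the version below departs from the signature above in two assumed ways,
namely that the output pointer is non-const so it can be written to, and
that an output capacity parameter is added. The function name and both
changes are assumptions made for this illustration, not part of any
library.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <ios>
#include <utility>

// Returns {characters consumed from input, characters written to output}.
// Converts as many characters as fit in the output buffer.
inline std::pair<std::streamsize, std::streamsize>
toupper_filter(const char* input, std::streamsize n,
               char* output, std::streamsize out_capacity)
{
    const std::streamsize count =
        std::max<std::streamsize>(0, std::min(n, out_capacity));
    for (std::streamsize i = 0; i < count; ++i) {
        output[i] = static_cast<char>(
            std::toupper(static_cast<unsigned char>(input[i])));
    }
    return {count, count};
}
```

A caller would advance its input by the first value and its output by the
second, then call again with whatever remains. Filters that buffer
internally would still need some way to be told to flush, which this
sketch does not address.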

> As an example of the difficulty of writing symmetric filters, look at
> http://tinyurl.com/6bu23. It took me two hours to get the
> toupper_symmetric_filter to work properly. (Note that I added the internal
> buffer just to simulate a realistic use case.)

It looks like the difficulty stemmed from having to keep track of
whether there are more characters available on the input and
whether writing to the output was still possible. Isn't this an
indictment of the current approach?

What I'm suggesting is that your library take care of reading and
writing data, and let the filters only be concerned with
providing the conversion of the input to some output. (I don't
know the internals of your library, and I'm probably missing
something key, but it seems as if some simplification should be
possible.)
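
One way to picture that split of responsibilities is sketched below, using
std::optional in place of boost::optional. The names pump and strip_cr are
hypothetical. The loop stands in for the library, and the filter sees one
character at a time, so this covers only filters that emit at most one
character per input character.

```cpp
#include <cassert>
#include <functional>
#include <istream>
#include <optional>
#include <ostream>
#include <sstream>

// A filter maps one input character to zero or one output characters.
using char_filter = std::function<std::optional<char>(char)>;

// "Library" side: owns all reading and writing.
inline void pump(std::istream& in, std::ostream& out, const char_filter& f)
{
    char ch;
    while (in.get(ch)) {
        if (const std::optional<char> result = f(ch)) {
            out.put(*result);  // the filter produced a character
        }
        // otherwise the filter swallowed the character
    }
}

// "Filter" side: only the conversion. This one drops carriage returns.
inline std::optional<char> strip_cr(char ch)
{
    if (ch == '\r') {
        return std::nullopt;
    }
    return ch;
}
```

The filter never has to ask whether more input is available or whether the
output can accept more characters; filters that expand their input or hold
state across calls would need a richer interface than this.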

> > > There are several choices for this type of passage:
> > >
> > > 1. Use the passive voice everywhere.
> > > 2. Use 'we' -- this sounds natural to me because it's used in mathematical
> > > papers.
> > > 3. Use 'you'
> > > 4. Use 'the user'
> > >
> > > Which should one prefer? Settling on one of the above and using it
> > > consistently should resolve many of your (snipped)
> > > suggestions/objections below.
> >
> > I use we in comments (design notes, and such), and you in
> > tutorial type text. After all, I'm teach "you," the reader in
                                        ^^^^^
teaching
> > such text. I think you should write the docs as if you are
> > writing to a single reader. That makes it easiest to get things
> > consistent.
>
> What about in the ordinary case (not comments, not tutorials)?

What is an "ordinary" case? Personal correspondence? Scientific
report? Essay on the current geopolitical state of the world?

> > > > "SeekableResource" refers to a read/write "head," but few, if
> > > > any, Resources for which this library will be used are physical
> > > > devices with read/write heads. I suggest using "location,"
> > > > "position," or "offset" in lieu of "head" here and in other parts
> > > > of the documentation.
> > >
> > > How about 'stream position indicator'?
> >
> > Why "indicator?"
>
> Substituting 'stream position' for 'reading head', etc., yields some funny stuff
> like:
>
> "Modes can be categorized in several ways ... 3. Whether the reading or
> writing stream positions are repositionable, and if so, whether there are
> separate stream positions for reading and writing or a single read/write
> stream positions."
>
> "Seekable: a single sequence of characters, for input and output, with
> a combined repositionable read/write stream position."
>
> How would you phrase this stuff?

I suggested "location" as a way around "repositionable
positions." Nevertheless:

   Modes can be categorized in several ways...reading or writing
   stream positions are Seekable and, if so, whether there....

and:

   Seekable: a single sequence of characters for input and
   output, with a common read/write stream position that can be
   moved to different parts of the sequence.

> > > > In "Direct," you refer to a socket-like interface, but not
> > > > everyone will understand what you mean. I suggest using the
> > > > phrase, "function-based interface."
> > >
> > > I see your point, but Direct Resources also have a function based interface.
> >
> > Well, you need something more specific and plain than
> > "socket-like interface."
>
> How about:
>
> "A resource is Direct if it provides access to its controlled sequences
> as in-memory arrays rather than incrementally using functions such as read
> or write."

Bingo.
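
The distinction in that definition can be sketched with two toy sources.
Both classes and their member names are made up for illustration and are
not claimed to match the library's concepts exactly: one hands out
characters incrementally through read, the other exposes its whole
sequence as an in-memory array.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Incremental access: the caller pulls characters with read().
class incremental_source_sketch {
public:
    explicit incremental_source_sketch(std::string data)
        : data_(std::move(data)) {}

    // Copies up to n characters into buf; returns the number copied
    // (0 once the sequence is exhausted).
    std::size_t read(char* buf, std::size_t n)
    {
        const std::size_t count = std::min(n, data_.size() - pos_);
        std::copy_n(data_.data() + pos_, count, buf);
        pos_ += count;
        return count;
    }

private:
    std::string data_;
    std::size_t pos_ = 0;
};

// Direct access: the controlled sequence is exposed as an array.
class direct_source_sketch {
public:
    explicit direct_source_sketch(std::string data)
        : data_(std::move(data)) {}

    // Returns [begin, end) pointers into the in-memory sequence.
    std::pair<const char*, const char*> input_sequence() const
    {
        return {data_.data(), data_.data() + data_.size()};
    }

private:
    std::string data_;
};
```

Both have a function-based interface, which is why "function-based" alone
does not separate them; what differs is whether the functions transfer
characters or simply reveal where the characters already are.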

> > > > _______________________________
> > > > The Metafunction mode
> > > >
> > > > You need to include an example and the signature of mode so we
> > > > know how to use it.
> > >
> > > This section contains a link to the reference docs for mode:
> > > http://tinyurl.com/5oqsy. Wouldn't it suffice to provide an example there?
> >
> > I did see mode.html later, but I don't think I noticed the link
> > when I was looking at this page. An example on that other page
> > is the right place, but maybe you can call attention to the link?
>
> See "The Class Template Mode" for example usage.

Yep.

> > > > ______
> > > > Synopsis
> > > >
> > > > Using "T" as the name of a policy class template parameter is
> > > > confusing. "T" is, of course, commonly used for a data type
> > > > stored in a container and for similar purposes. The closest
> > > > thing to "T" in streambuf_facade is the "Ch" nested type in the
> > > > policy class. Change "T" to a more meaningful name.
> > >
> > > I adopted this convention from "C++ Templates", p. 10. FilterOrResource
> > > would be more descriptive, but it's too long. What would you suggest?
> >
> > You need a supercategory of Filter and Resource. "Component?"
>
> The trouble is I want the concept names to be unique not just within the library
> but in a wider context, including the standard library and the rest of Boost. So
> the concept name should have IO in it somewhere.

Neither "Filter" nor "Resource" contains "IO." I think you're
alluding to the generality of "Component" but isn't that a
problem for "Resource," too?

You can add an "IO" prefix, if you like, but offhand, I can't
think of anything better.

-- 
Rob Stewart                           stewart_at_[hidden]
Software Engineer                     http://www.sig.com
Susquehanna International Group, LLP  using std::disclaimer;
