Boost logo

Boost :

From: Jonathan Turkanis (technews_at_[hidden])
Date: 2004-09-03 21:37:20


"Pavel Vozenilek" <pavel_vozenilek_at_[hidden]> wrote in message
news:chade8$n8k$1_at_sea.gmane.org...
>
> > start of the formal review of the Iostreams
> > library by Jonathan Turkanis.
> >
> >
> I vote to include Iostreams into Boost with both hands.
> It is prime example of professional work.

Thanks!

> Looking back on MoreIo library I recommend:
> - to prefere Iostreams for all overlapping functionality
> - to move the applicable rest from MoreIo into Iostreams
> with:
> - naming conventions taken from Iostreams
> - documentation look and feel from Iostreams

I assume you view null_buf, pointerbuf, etc. as overlapping functionality. What
about 'streambuf wrapping'?

> For user it is always better to have one library
> with consistent structure/naming/documentation
> than to have many of them. This is hard earned
> experience.

Yes.

> As the Iostreams is large, I didn't yet read reference
> docs and test code. I plan it and will post rest
> of review later.

I look forward to it. Note that the end date for this review is not yet fixed.

> Current review notes are bellow.
>
>
> To deal with its complexity I recommend to add many
> more overcommented code snippets into documentation
> and add hyperlinks to relevant examples on every fitting place.
> Pictures, flowcharts and diagrams would help too.
> Some information would look better in tabular form.
> (Details bellow).

Yes; the review so far has made this very clear to me.

> __________________________________________________________________
> 1. overview docs:
>
> The first occurence of terms "data source"
> and "sink" should be hyperlinked (it is but only later
> in text). Dtto "filters".

My reason for not hyperlinking the first occurence was my feeling that in the
first two paragraphs I was simply using them as 'common sense' terms, but that
when I used the terms in the section marked 'Concepts' I was formally
introducing them as technical terms defined by the library. Obviously this was
not very clear.

> It may be also useful to put there some diagram. In addition
> link to short code example could be added to the first
> occurence for each term.

Good idea. I've become convinced of the need to add explanatory material right
at the beginning.

> The enumeration of components as 'access to memory-mapped files'
> may look better in <ul>...</ul> list.

Okay.

> Sentence: "Sources, Sinks and their refinements are called
> resources."
>
> The term is overused in programming. Maybe "data resources"
> or "data-module" or so would be better.

I know that 'resource' is a loaded term ;-) but I couldn't think of anything
better. 'Data-module' doesn't seem very descriptive. Unfortunately, the standard
term seems to be "Source/Sink" -- which I find ugly.

> Relation between sins/sources/filters and streams should be
> explained immediatelly. Possibly using picture.

Okay. What do you think of the pictures here: http://tinyurl.com/5v2ak? Too much
detail for the introduction?

> Link to "adapters.html" is broken.

Which link?

> The links to iterator_facade and Iterators library should
> be local.

I'm planning to make all the boost links local eventually. I wanted them to work
properly for people viewing the docs from the zipped package.

> __________________________________________________________________
> 2. tutorial docs:
>
> The first code example:
> s[z] = (char) (rand() * 256 / RAND_MAX);
> I think this causes overflows way too often.
> s[z] = (char) (rand() % 256);
> is more likely intended even if it
> adds small bias.

Thanks.

> In example:
> in.open(634875);
> also stream example should be added,
> just to show how it is written.

I don't understand.

> streambuf_facade and stream_facade should be hyperlinked.
> (I am fan of hyperlinking everything everywhere.)

I tried to be liberal with my hyperlinks. Sometimes I find it annoying when the
same word is hyperlinked several times in a sentence. But you're right, in the
tutorial streambuf_facade and stream_facade definitely need to link the user's
guide or the reference.

> In code example line:
> streambuf_facade<vector_sink> out(v); // error!!
> it should be sais if it is compile time error,
> assertion or whether user needs to check it manually.
>
> I wonder it is is possible to workaround this limitation
> internally in library by const_casts/whatever nasty.

These are the standard forwarding problem trade offs. The one thing I defintely
want to avoid is allowing const references to bind to temporary stream objects:

     filtering_istream in(stringstream("hello"));

This would immediately lead to disaster.

> Sentence:
> "... and modifies it before returning it to the user."
> should be
> "... and modifies or eliminates it before returning it to the user."

Isn't elimination a type of modification?

> In example:
>
> filtering_streambuf<input> in;
> in.push(alphabetic_input_filter());
> in.push(cin);
>
> It should be explained what is "input" and where
> does it come from.

You're right. To clarify, input is a tag struct representing the i/o mode
'input'.
So, filtering_streambuf<input> is a read-only filtering stream, synonymous with
filtering_istream, and filtering_streambuf<output> is a write-only filtering
stream, synonymous with filtering_ostream

> Shouldn't it have more expressive
> name e.g. input_filter_base? Maybe the source/sink too.

I don't follow this.

> It is a bit confusing to see push() of both input_filter
> descendant and source descendant. Should be noted,
> maybe with class diagram.

Others have said this too. Originally I had something like push_filter and
push_resource. The i/o categories made this unnecessary. I consider the
std::stack-like push/pop/size/empty interface to be much more elegant. After
all, there's just one stack. Internally, also, adding a filter is essentially
the
same as adding a resource.

> All examples use variable names 'in' and 'out'. It may be
> confusing for one just skimming through code snippets.
> Maybe different names (e.g. vector_in) could be used.

Okay.

> It would be also nice to distinguish variables and types
> systematically. E.g. variables having _ suffix.

That's the convention I use for member variables.

> filters.html link is broken.

Which link?

> Generally, tutorial may be sprinkled with few more
> overcommented code snippets. They may be hide-able.
> Other docs won't suffer having them too.

Okay. What do you mean by hideable? I'm hoping not to have to use any javascript
except for the menu.

> __________________________________________________________________
> 2. examples docs:
>
> Link to regex should be local.
>
> Link "Uncommenting Input Filter" is broken.

If you mean the link under "Regex OutputFilter", I think it appears broken only
because there's nowhere left to scroll.

> __________________________________________________________________
> 3. presidential_output_filter.hpp: a nitpick, in code:
>
> if (word == "subliminable")
> return "subliminal";
> else if (word == "nucular")
> ....
>
> the 'else' parts are not needed.

True.

> __________________________________________________________________
> 4. docs: all <img> elements should have their width/height set in HTML

For slow connections, I guess?

> __________________________________________________________________
> 5. concepts docs: all concepts here can be also listen in table,
> sorted and with one line info.
>
> The individual concepts should be added as sub tree into left panel.
> Dtto compression filters, etc.
> Every page should be accessible from left panel.

I'll try to do this, but I might find that the menu is so big it takes too long
to load.

> __________________________________________________________________
> 6. Installation docs: it should be in moredetail described what is
> BOOST_IO_NO_LIB for and who needs it for what.
>
> Maybe there should be list of *.cpp files one needs
> to use for zlib/bzip2/..., for those who do not want
> to use autolink.

All the .cpp files are enumerated here: http://tinyurl.com/6klot.

> __________________________________________________________________
> 7. "Classes by category" docs: each link could have very short
> decription on the left:
>
> array_resouce (xyz...)
> array_sink (abc...)
>
> Similarly classes sorted alphabetically, functions, etc.
> It would help when one is trying quickly locate some info.

Good idea.

> __________________________________________________________________
> 8. All applicable headers should have on top:
>
> #if (defined _MSC_VER) && (_MSC_VER >= 1200)
> # pragma once
> #endif
>
> to keep compilation times for VC and Intel C++ down.

I thought I already made this change.

________________________________________________________________
> 9. Syntax like this was suggested:
>
> filter_stream out(tee(std::cout) | encode | gzip | file("some file"));
>
> Maybe support for this could be built up on existing framework.

I think it's really cool. I've already contacted the (co-)author JC van Winkel,
who gave me permission to use it.

> __________________________________________________________________
> 10. To question:
>
> > do you prefer the names filter_stream and filter_streambuf to
> > filtering_stream and filtering_streambuf?
> >
> The first. Steve McConnell in his Code Complete 2 quotes
> study recommending identifier length 10-16 chars.

Me too.

> __________________________________________________________________
> 11. A collection of useful stream(buffer)s could be added into library,
> so they are handy to users:
>
> - /dev/nul like stream (like the one from more_io)
>
> - tee, tee3, ... etc stream
>
> - random (the one from code snippet)

These are all good. I'm thinking as the collection of provided components grows,
I may need to be more systematic about library organization. So one might have
directories boost/io/filters/compression, etc.

> - repeater (some sequence of characters over and over)
> This one could ideally accept syntax used inn Boost.Assign.
> It would allow to easily create complex but predictable
> test data structures.

Could you elaborate?

> - 'a switcher'. Something what sends data into another stream
> but has method switch_to_other_stream(s). E.g. for circular logs.

Doable, but presents some problems with buffer synchronization. There is
currently no generic sync() function or Synchronizable concept.

> __________________________________________________________________
> 12. One could dream about ability to use lambda expressions directly
> for filters.

Lambda expressions which define function objects which transform characters one
at a time whould be easy. But I suppose you mean lambda expressions representing
complex filtering operations.

Let's see ....

How about this: instead of place-holders _1, _2, etc. we could have _get, _peek,
_put... The following could be an inline version of Kanze's uncommenting input
filter (http://tinyurl.com/6qn3a):

    if_(_get == '#')
    [
       while_(_peek != EOF && _peek != '\n') [ _get ]
    ]
    ,
    _peek

:-)

> __________________________________________________________________
> 13. Possible feature of library (I do not know whether it is
> implementable or implementable efficiently/elegantly):
>
>
> - lets have chain of streams (e.f. buffer/compress/write file)
>
> - someone feeding the head of chain may want to execute an action
> somwhere down in the chain (e.g. switching files).
>
>
> - now it would require 'someone' to know all details of chain
> and to have access points to its parts (=> high coupling)

Right. That's the status quo.

> It would be nice to be able to 'send a command' down the stream
> and have some part of chain picking it up and executing.
> This way sender would not need to know anything about
> current chain structure.
>
> If the command wouldn't be recognizable by any part of stream
> chain it would become nop.
>
>
> Similarly one can imagine events going upstream.
> If anyone handles them: OK, if not they will be ignored.
>
> (They are different from exceptions that ignoring them
> is default and harmless.)

Sounds interesting. I believe this would have to rely on runtime rather than
compile-time polymorphism. I'll think about it.

> Examples of downstream commands:
> - switch file (for logging)
> - flush compressor dictionaries
>
> Examples of upstream events:
> - predetermined file size reached
> - current compression ratio info

> __________________________________________________________________
> 14. May it be possible to add "synchronize panels" button into
> left panel? If clicked, it would expand and highlight item
> in left panel for currently opened page in right panel.

The [link to this page] button already expands the tree to the current page, if
the tree contains a link to the current page. I guess highlighting would be easy
enough to do. I'm afraid some would find this annoying.

> __________________________________________________________________
> 15, cloable.html: it should be explained (+ example) what
> 'notification' here means.

How's this: A Closable filter or resource receives notifications -- via the
function <A HREF='...'>boost::io::close</A> --immediately before a containing
stream or stream buffer is closed.

I defintely need an example here.

> __________________________________________________________________
> 16. The examples section should contain all available examples,
> e.g. compressors. There may be subsection for them.
>
> Maybe something as:
> + Examples
> + Simple Examples
> ....
> + Compressors Examples
> ....
> + ....
>
> It is because many people would not go any futher than here
> and would miss the rest.

Good idea.

Thanks for the detailed examination!

Jonathan


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk