
From: Jonathan Turkanis (technews_at_[hidden])
Date: 2005-01-11 22:56:05


Hi Christopher,

I think I haven't made my main point clear, so let me try to rephrase it.

Some filtering operations are easier to express using the InputFilter concept
than the OutputFilter concept, and vice versa. For example, the tab-expanding
filter

http://home.comcast.net/~jturkanis/iostreams/libs/iostreams/doc/?path=4.1

is easier to express as an output filter, since whenever it encounters a tab
character it can simply write the appropriate number of space characters to the
provided Sink. If it were an InputFilter, whenever it read a tab character from
the provided Source it would have to return a single space character and then
record the number of space characters to return on subsequent calls to get().
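
A minimal sketch of this contrast, not the library's actual tab-expanding filter: string_source, string_sink, both filter structs and the two expand_with_* helpers are hypothetical names invented for illustration, and -1 stands in for end of input. The output version handles a tab on the spot; the input version needs an extra "pending" counter to remember how many spaces it still owes.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-ins for a Source and a Sink (not the library's types).
struct string_source {
    std::string data;
    std::size_t pos;
    explicit string_source(const std::string& s) : data(s), pos(0) {}
    // Returns the next character, or -1 at end of input.
    int get() {
        return pos < data.size() ? static_cast<unsigned char>(data[pos++]) : -1;
    }
};

struct string_sink {
    std::string data;
    void put(char c) { data += c; }
};

// Output-filter style: a tab is expanded immediately by writing several
// spaces to the Sink, so only the current column has to be tracked.
struct tab_expanding_output_filter {
    int tab_size;
    int col;
    explicit tab_expanding_output_filter(int n) : tab_size(n), col(0) {}
    template<typename Sink>
    void put(Sink& snk, char c) {
        if (c == '\t') {
            int n = tab_size - col % tab_size;  // spaces up to the next tab stop
            for (int i = 0; i < n; ++i) snk.put(' ');
            col += n;
        } else {
            snk.put(c);
            col = (c == '\n') ? 0 : col + 1;
        }
    }
};

// Input-filter style: get() can return only one character per call, so the
// spaces still owed after a tab must be recorded between calls.
struct tab_expanding_input_filter {
    int tab_size;
    int col;
    int pending;  // spaces still to be returned for the last tab
    explicit tab_expanding_input_filter(int n) : tab_size(n), col(0), pending(0) {}
    template<typename Source>
    int get(Source& src) {
        if (pending > 0) { --pending; ++col; return ' '; }
        int c = src.get();
        if (c == '\t') {
            pending = tab_size - col % tab_size - 1;  // first space returned now
            ++col;
            return ' ';
        }
        if (c == '\n') col = 0;
        else if (c != -1) ++col;
        return c;
    }
};

// Push every character of `in` through the output filter into a sink.
std::string expand_with_output_filter(const std::string& in, int tab_size) {
    tab_expanding_output_filter f(tab_size);
    string_sink snk;
    for (std::size_t i = 0; i < in.size(); ++i) f.put(snk, in[i]);
    return snk.data;
}

// Pull characters through the input filter until the source is exhausted.
std::string expand_with_input_filter(const std::string& in, int tab_size) {
    tab_expanding_input_filter f(tab_size);
    string_source src(in);
    std::string out;
    for (int c = f.get(src); c != -1; c = f.get(src)) out += static_cast<char>(c);
    return out;
}
```

Both helpers should produce the same text for the same input; the difference is only in how much state each filter has to carry between calls.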

Similarly, whenever a new filter concept is added, I'd like to have a concrete,
real-world example (preferably several examples) of a filtering operation which
is easier to express using the new concept than using any of the existing
concepts. I was hoping you would be able to say something like this:

 "Consider the XXX filter; look how easy it is to express using a co_filter:

     [code]

Now look how hard it is to express it as an InputFilter

     [code]

or as an OutputFilter

    [code]

Therefore, adding direct support for the co_filter will make the iostreams
library easier to use."

I'm getting the feeling, however, that you think the existing concepts are
simply too hard to understand. Is this correct? Do you think more examples or
tutorial material might help?

christopher diggins wrote:
> ----- Original Message -----
> From: "Jonathan Turkanis" <technews_at_[hidden]>
>>> void DoWork() {
>>> // do work
>>> }
>>>
>>> int main() {
>>> DoWork();
>>> }
>>
>> Okay, but do you really expect people to start writing programs this
>> way?
>
> Yes, especially if the iostreams library provides the functionality
> to them. My point is that C++ code is pointlessly hard to reuse as
> is, and I am pushing for new ways to make small programs more
> reusable. This is incredibly important when managing large numbers of
> small programs (for instance library tests and demos).

I agree that reuse is critical. The iostreams library was designed to allow
developers to create highly reusable components.

> It is trivial to
> refactor code to make it look like the above, just cut and paste the main!

If you can convince me that there are lots of existing programs lying around
which can be transformed into co_filters by modifying a few lines, I'd be
inclined to add co_filters to the library. One thing I'd like to know is why
system_filters wouldn't be an acceptable vehicle for reuse in that case.

>> I think people would only do this to conform to your filter concept. My
>> question is: why aren't the other concepts sufficient?
>
> The other concepts are fine, they are just more obfuscated than most
> programmers require. Just imagine trying to explain how to use a
> filter concept in a way which makes sense to a Java / Delphi / C
> programmer.

Actually, the InputFilter and OutputFilter concepts are very similar to the
classes FilterInputStream and FilterOutputStream from java.io. The main
difference is that the Java classes store a reference to the downstream Device
as member data while models of the Boost concepts are passed the downstream
Devices as function arguments. This was done to allow the exact type of the
downstream Device to vary, to prevent user-defined filters from having to derive
from specific classes, and to shield the user from managing the lifetime of the
downstream Device.
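
A rough sketch of that difference, written in C++ for both sides: the names input_stream, string_input_stream, upper_filter_stream, string_source, upper_filter and the two helper functions are hypothetical and are neither the java.io classes nor the library's types. The first filter stores its downstream stream as member data and derives from a base class; the second is handed the Source as a function argument, so the Source type can vary and no base class is needed.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Style 1 (modelled loosely on java.io): filters derive from a common base
// and keep a reference to the downstream stream as member data.
struct input_stream {
    virtual int read() = 0;  // returns -1 at end of input
    virtual ~input_stream() {}
};

struct string_input_stream : input_stream {
    std::string data;
    std::size_t pos;
    explicit string_input_stream(const std::string& s) : data(s), pos(0) {}
    int read() override {
        return pos < data.size() ? static_cast<unsigned char>(data[pos++]) : -1;
    }
};

struct upper_filter_stream : input_stream {
    input_stream& in;  // the user must keep the downstream stream alive
    explicit upper_filter_stream(input_stream& s) : in(s) {}
    int read() override {
        int c = in.read();
        return c == -1 ? -1 : std::toupper(c);
    }
};

// Style 2 (concept-based): the filter stores nothing about the downstream
// object; any type with a suitable get() can be passed in.
struct string_source {
    std::string data;
    std::size_t pos;
    explicit string_source(const std::string& s) : data(s), pos(0) {}
    int get() {
        return pos < data.size() ? static_cast<unsigned char>(data[pos++]) : -1;
    }
};

struct upper_filter {
    template<typename Source>
    int get(Source& src) {
        int c = src.get();
        return c == -1 ? -1 : std::toupper(c);
    }
};

// Read everything through the member-reference style filter.
std::string read_all_member_style(const std::string& text) {
    string_input_stream base(text);
    upper_filter_stream filter(base);
    std::string out;
    for (int c = filter.read(); c != -1; c = filter.read()) out += static_cast<char>(c);
    return out;
}

// Read everything through the function-argument style filter.
std::string read_all_argument_style(const std::string& text) {
    string_source src(text);
    upper_filter filter;
    std::string out;
    for (int c = filter.get(src); c != -1; c = filter.get(src)) out += static_cast<char>(c);
    return out;
}
```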

> I think it is important to try and provide alternatives where possible
> which makes sense to professional programmers who may not be familiar
> with the intricacies of generic programming techniques and functors.

My hope is that users who do not feel comfortable reading the semi-formal
concept specifications will be able to learn to use the library quickly by
studying the examples. I intend to provide many additional examples in the final
documentation.

>>> By taking this one simple step, a person's code could then be easily
>>> reused. This does overlook the fact that most filter programs take
>>> parameters which is trivially remedied.
>>
>> Note that none of the other concepts has this problem.
>
> Noted. I think that all of the concepts should be supported! I am simply
> arguing the case for void(*)() as a valid concept for now.

I don't think I ever said it was invalid.

>> I guess I should have asked for a *realistic* example. If you really
>> write such simple programs you don't have to worry about reuse; it's
>> simpler to write the whole program again from scratch.
>
> First off I do write programs as simple as that and I have a lot of
> them. This occurs frequently for testing, prototypes, demos, and
> systems admin. I strongly disagree with maintaining multiple code
> bases, rather than refactoring and reusing the code. As a
> professional coder I am always looking for ways to be more productive
> and have less code to manage.

To me, if you want to be able to reuse code which converts to uppercase, simply
write an uppercase filter; e.g.:

   struct toupper_filter : input_filter {
        template<typename Source>
        int get(Source& src) { return toupper(boost::io::get(src)); }
   };

> Nonetheless, I do currently have a non-trivial program which converts
> C++ into a <pre></pre> html tag, CppToHtmlPreTag, it operates
> obviously on the stdin and outputs to stdout. It looks essentially
> like this:
>
> void CppToHtmlPreTag() {
> // calls multiple other functions to do the work
> };
>
> int main() {
> CppToHtmlPreTag();
> return 0;
> }
>
> I want to reuse this program in another program which outputs an
> entire Html Document with a header and footer. ( CppToHtmlDoc ). The
> easiest way I can think of to do this is to write a new program such
> as (this is to a certain degree pseudo-code):
>
> struct CppToHtmlDoc {
> CppToHtmlDoc(string css, string title) : mCss(css), mTitle(title);
> void filter() {
> cout << "<html><head><title>" << mTitle;
> cout << "</title><link rel='stylesheet' type='text/css' href='";
> cout << mCss << "'/><body>"
> cin | CppToHtmlPreTag();
> cout << "</body></html>";
> }
> string mCss;
> string mTitle
> }
>
> int main(int argc, char** argv) {
> assert(argc == 4);
> CppToHtmlDoc(argv[1], argv[2]) | filestream(argv[3]);
> }
>
> So I wrote program2 using the [b] approach you outlined which I agree
> that it is superior for this program. I also managed to retain my original
> code precisely as is using the [a] approach. If I had to rewrite the
> original program to use a filter concept I would have had to rewrite
> *all* of my functions to pass the Source and Sink types to each
> one, and to use src and snk instead of cin / cout.

I'm starting to believe that having to pass the Source or Sink (or both) as
function arguments is what you find problematic, but
I don't follow the entire discussion. Why would [c] require you to rewrite each
function to take Source and Sink parameters, but not [b]?

> I guess my point here is that I am able to refactor existing code more
> easily and quickly if you support [a] and [b] syntax. [c] is perfectly
> acceptable, and has its advantages in several scenarios, even though
> it is overkill for my work.

If [c] is perfectly acceptable, then it can't be passing the Source and Sink as
function arguments which is troubling you. As a result, I don't see why it is
easier to express CppToHtmlDoc as a co_filter than as one of the other types of
filters. You omitted the implementation, which is where the difference would
presumably show itself.

>>>> The expression
>>>>
>>>> filter1() | filter2() [A]
>>>
>>> Sorry, I thought it was a statement. Wouldn't it be useful to also
>>> allow one liners with a separate syntax:
>>>
>>> source() > filter1() > filter2() > sink();
>>
>> As I mentioned in a previous message, I think this is a good idea,
>> if it uses
>> the pipe notation. I don't see why a different operator should be
>> used.
>
> I just want to be able to write:
>
> filter1() | filter2();
>
> as a statement, with the implicit understanding it pumps from cin and
> to cout. But you told me I can't have that, so I am offering ">" as a
> possible work-around.

Okay. But wouldn't the main use case be writing command-line filters? In that
case, the bulk of the main function could be written:

     filtering_ostream out(filter1() | filter2() | ref(cout));
     copy(std::cin, out);

Is that really so hard? (I'll eliminate the need for "ref()" shortly.)

> Implementation aside, don't you agree that from the
> POV of an end-user it is obvious that the above statement should
> be equivalent to:
>
> cin | filter1() | filter2() | cout;

No.

> I see that this conflicts with the current meaning of filter1() |
> filter2(), so I would propose that instead that could be rewritten as
> filter1() + filter2(). As an end-user I expect a | to have executed
> the data pumping by the end of the statement. If it does it sometimes
> (i.e. source() | filter() > sink(); statements) but not other times (i.e.
> filter1() | filter2() expressions) then these are two separate and
> conflicting meanings of
> filter > filter, which I am not comfortable with.

You were the one who suggested the first meaning. Now you say you are not
comfortable with it because it conflicts with the second (pre-existing) meaning!

Best Regards,
Jonathan

