From: Jonathan Turkanis (technews_at_[hidden])
Date: 2004-09-22 18:11:30


"John Torjo" <john.lists_at_[hidden]> wrote in message
news:414443E0.60008_at_torjo.com...
> Dear boosters,
>
> The FORMAL Review of "Output Formatter" library begins today,
> Sept 12, 2004.

Hi, I hope I didn't miss the deadline.

Let me start by saying that this is a library I know and love (warts and all). I
examined all the source code thoroughly at several points in its development,
helped port the library to several compilers, and even contributed little bits.

So I'm sorry I don't have time to write a really detailed review. The review
came at a bad time for me -- right after the review of my iostreams library. My
review is based on reading (most of) the current thread and on my previous
experience with the library.

I'll reorder the basic review questions:

> 4. What is your evaluation of the potential usefulness of the library?

   Extremely useful.

> 3. What is your evaluation of the documentation?

   I haven't read the current docs, but the last time I read them I found them
insufficient, for reasons many have pointed out. There needs to be a general
introduction explaining the scope of the library, lots of examples, and clear
instructions on how to extend the library. In order to learn about the library, I
had to ask Reece a lot of questions and read the source.

> 2. What is your evaluation of the implementation?

   The implementation is very good, especially those little parts attributed to
someone named Jonathan ;-)

> 5. Did you try to use the library?

   Yes, I used it extensively when I was porting it to Borland 5.x, Metrowerks,
Comeau, Intel and (with limited success) to VC6.

> 6. How much effort did you put into your evaluation?

    I think I've already answered this one.

> 7. Are you knowledgeable about the problem domain?

    It's hard to say exactly what the problem domain is. If it's logging or
debugging, then no, I'm not an expert, but I do know more than a little. However,
I think the library potentially has much wider applications.

> 1. What is your evaluation of the design?

    I have some serious issues with the design, which I raised with Reece at an
early stage. I really don't have any right to complain about them now, however,
since I offered to collaborate on the library some time ago but then got busy
with other things.

   I will explain what I think the purpose of the library should be, sketch some
concepts and give some example uses. Let me note that I have implemented most of
the following ideas, but only as a proof of concept.

   I see the library as the inverse of Spirit. Spirit takes a linear text and
builds complex objects, while the output formatting library takes complex
objects and renders them as linear text. Just as an abstract syntax tree does
not preserve all the information in the input text, in many cases it will be
desirable to lose information when an object is formatted using the present
library. For example, sometimes you might want a dog to be formatted as follows:

       [ Dog; name: rover; breed: terrier mix; weight: 80 lb; daily habits:
unspecified ]

Other times, you might just want: [Dog: rover]. Therefore, I think the library
should handle output only.

  I see the library as consisting of three components:

I. Type classification for standard and Boost types, including
    A) a system for classifying types as
        1. variable-length sequences of objects of a single type (example:
std::vector)
        2. heterogeneous fixed-length sequences of objects (example:
boost::tuple)
        3. types with more elaborate structure, something like XML Schema
content models -- but I never gave this part much thought, so ignore it ;-)
    B) function templates for extracting the elements from instances of the
types with the above structures

II. A system for allowing user-defined types to advertise their internal
structure, so that they can be accessed like the types in I. For example, a Dog
class might advertise that it consists of a string name and a float weight.
There are a number of ways that this could be done, such as with
member pointers, default-constructible functors which extract the information,
etc. Any combination of these techniques should be allowed.

III. A framework of composable formatting objects (I'm using the term differently
than the current library does) used to customize how complex types are output.
    A. The main building block is the concept of a Formatter (sketched below).
There will be a number of built-in formatters, such as
        1. sequence_formatter, for formatting objects of a type I.A.1. using
specified opening, closing and separator strings
        2. nary_formatter<N>, for formatting objects of a type I.A.2. Nary
formatters can be specified with expression templates -- e.g.,

        str("[") << _2 << " : " << _1 << ")"

would format a pair (a, b) as [b : a). (Note the reversed order.) I've also
experimented with the following notation, for formatting user-defined types:

    str("Dog:") << member(dog_name)
                << ","
                << member(dog_height)
                << "]"

    B. Styles will be composed from formatters. Formatters can be added to a
style without qualification or with the stipulation that they apply only to
objects of a given type or only to objects of types which satisfy a given mpl
lambda expression. The order in which formatters are specified can create a
cascading effect as in CSS.

    C. A single function boost::io::format, which takes an arbitrary type and
returns an object which can be output using operator<<. Examples:

    cout << boost::io::format(obj); // Uses the default style

    cout << boost::io::format(obj).with(dog_format()); // Doggy-style

    cout << boost::io::format(obj) // Uses a complex style
                    .use< is_vector<_> >( sequence_format("[", ",", "]") )
                    .use< is_pair<_> >( str("(") << _1 << ":" << _2 << "]" );

In the last example, nested objects which are standard vectors will be formatted
[a, b, c, d...], while std::pairs will be formatted (a:b]. So a pair of vectors
will look like this:

      ([a,s,d,f,g]:[a,w,w,e,r]],

while a vector of pairs will look like this:

    [(a:b],(c:d],(e:f],(g:h],(i:j]]

This last example suggests that it would be useful to compose formatters and
store them so that they can be reused. Unfortunately, once the static type is
lost, the complex formatting objects are useless in many cases. Ideally one would
use 'auto':

    auto style = cajun_style().use< is_string<_> >( ... )
                              .use< ... >
                              .etc

With the current language, the best way to store styles is to define functions
which return instances of them. This means you have to explicitly describe the
return type, but only once.

    [unspecified style type] cajun_style();

     cout << boost::io::format(obj).with(cajun_style());
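
As a rough illustration of the two ideas above (a style stored by a function
that returns it, and nested vectors and pairs formatted recursively), here is a
minimal sketch. Every name in it (delimiters, toy_style, toy_cajun_style,
format_to, to_text) is an assumption made for the example, not part of the
reviewed library or of the proof of concept. Plain function overloads stand in
for the type classification, and a plain struct stands in for the composed
style type.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// All names below are hypothetical; none come from the reviewed library.
struct delimiters {
    const char* open;
    const char* sep;
    const char* close;
};

// A toy style: one set of delimiters for vectors and one for pairs.
struct toy_style {
    delimiters sequence;
    delimiters pair;
};

// The style is stored by a function that returns an instance, so its type
// is spelled out once, here. In the proposal this type would be a long
// template instantiation rather than a plain struct.
toy_style toy_cajun_style()
{
    toy_style s = { { "[", ",", "]" }, { "(", ":", "]" } };
    return s;
}

// Declared up front so that the nested calls below can find every overload.
template<typename T>
void format_to(std::ostream& out, const T& t, const toy_style& s);
template<typename A, typename B>
void format_to(std::ostream& out, const std::pair<A, B>& p, const toy_style& s);
template<typename T>
void format_to(std::ostream& out, const std::vector<T>& v, const toy_style& s);

// Fallback: any other type is written with its own operator<<.
template<typename T>
void format_to(std::ostream& out, const T& t, const toy_style&)
{
    out << t;
}

// Pairs use the pair delimiters and recurse into both members.
template<typename A, typename B>
void format_to(std::ostream& out, const std::pair<A, B>& p, const toy_style& s)
{
    out << s.pair.open;
    format_to(out, p.first, s);
    out << s.pair.sep;
    format_to(out, p.second, s);
    out << s.pair.close;
}

// Vectors use the sequence delimiters and recurse into each element.
template<typename T>
void format_to(std::ostream& out, const std::vector<T>& v, const toy_style& s)
{
    out << s.sequence.open;
    for (std::size_t i = 0; i < v.size(); ++i) {
        if (i != 0) out << s.sequence.sep;
        format_to(out, v[i], s);
    }
    out << s.sequence.close;
}

// Convenience wrapper that returns the formatted text as a string.
template<typename T>
std::string to_text(const T& t, const toy_style& s)
{
    std::ostringstream out;
    format_to(out, t, s);
    return out.str();
}
```

With these definitions, to_text(v, toy_cajun_style()) for a vector v holding
the pairs ('a','b') and ('c','d') yields [(a:b],(c:d]]. The sketch fixes the set
of handled types at compile time through overloads; the design described above
instead selects formatters through mpl lambda expressions attached to the style.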

----------------------

Finally, let me describe what a formatter looks like. It is a class type with a
templated member function format having the following signature

    template<typename Ch, typename Tr, typename T, typename Context>
    basic_ostream<Ch, Tr>&
    format(basic_ostream<Ch, Tr>& out, const T& t, Context& ctx);

Here T is the type whose instance is to be formatted, and ctx contains the
prevailing Style (a combination of formatters) as well as contextual information
like depth of nesting and level of indentation. Formatters can specify that they
are able to handle any type or only certain types (such as 3-ary types or types
satisfying an mpl lambda expression).
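
To make the Formatter signature above concrete, here is a minimal sketch of one
formatter and a context. It is only an illustration under stated assumptions:
simple_context and this particular sequence_formatter are hypothetical, the
context tracks nothing but nesting depth, and elements are written with their
own operator<< instead of being dispatched through a prevailing style.

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical context: it tracks only the current nesting depth. The
// prevailing Style and the indentation level are left out of this sketch.
struct simple_context {
    int depth;
    simple_context() : depth(0) {}
};

// A formatter with the member signature given above. This one accepts any
// type that offers begin() and end() and a nested const_iterator.
class sequence_formatter {
public:
    sequence_formatter(const char* open, const char* sep, const char* close)
        : open_(open), sep_(sep), close_(close) {}

    template<typename Ch, typename Tr, typename T, typename Context>
    std::basic_ostream<Ch, Tr>&
    format(std::basic_ostream<Ch, Tr>& out, const T& t, Context& ctx) const
    {
        ++ctx.depth;                 // entering one level of nesting
        out << open_;
        for (typename T::const_iterator it = t.begin(); it != t.end(); ++it) {
            if (it != t.begin()) out << sep_;
            out << *it;              // no style lookup in this sketch
        }
        out << close_;
        --ctx.depth;                 // leaving that level again
        return out;
    }

private:
    const char* open_;
    const char* sep_;
    const char* close_;
};
```

Calling sequence_formatter("[", ",", "]").format(out, v, ctx) on a
std::vector<int> holding 1, 2, 3 writes [1,2,3] to out and leaves ctx.depth at
its starting value. A fuller version would look up each element's formatter in
the style carried by ctx rather than calling operator<< directly.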

> 8. Do you think the library should be accepted as a Boost library?
> Be sure to say this explicitly so that your other comments don't obscure
> your overall opinion.

This is difficult. But here goes ...

I think the library should be ACCEPTED, but *only* if it can be done without
major changes. I wouldn't mind some of the ideas that I or others have sketched
being incorporated into a future version of the library. However, if any major
redesign is to be done, I believe another review is crucial, since the various
proposed changes by Reece and others have not been spelled out in sufficient
detail for them to be scrutinized.

Best Regards,
Jonathan

