From: Jonathan Turkanis (technews_at_[hidden])
Date: 2004-11-12 11:54:26


[I apologize if this message shows up twice -- I sent it last night and it still
hasn't appeared]

 "Pavel Vozenilek" <pavel_vozenilek_at_[hidden]> wrote in message
news:cn0l08$cul$1_at_sea.gmane.org...
>
> "Jonathan Turkanis" wrote:
>
> > > // specialization for debug formatting
> > > template<>
> > > void serialize<formatted_text_oarchive>(....) {
> >
> > I believe this specialization is illegal. You could write
> >
> > void serialize(formatted_text_oarchive& ar, const unsigned)
> >
> > but I can't say whether this will work (I seem to remember Robert saying
> > somewhere in the documentation that he was relying on the fact that
> > non-templates are better matches than templates.)
> >
> Yes, this (nontemplated overloaded function) will work.
> I just wrote down an idea in haste.
>
>
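To make the overload-resolution point concrete, here is a minimal sketch. It assumes a free-function serialize taking the object as a second parameter; Widget and the empty archive structs are hypothetical stand-ins, and the functions return a string only so the chosen overload can be observed (a real serialize would return void).

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins; these are NOT the real Boost.Serialization archives.
struct text_oarchive {};
struct formatted_text_oarchive {};
struct Widget {};

// Generic version, viable for every archive type.
template<class Archive>
std::string serialize(Archive&, Widget&, const unsigned) {
    return "generic";
}

// Non-template overload. When both candidates match equally well,
// overload resolution prefers the non-template function.
inline std::string serialize(formatted_text_oarchive&, Widget&, const unsigned) {
    return "formatted";
}

// Helpers reporting which overload is selected for each archive type.
std::string pick_for_text() {
    text_oarchive ar;
    Widget w;
    return serialize(ar, w, 0u);
}

std::string pick_for_formatted() {
    formatted_text_oarchive ar;
    Widget w;
    return serialize(ar, w, 0u);
}
```

No explicit specialization is involved here: the second serialize is an ordinary function that simply wins the tie against the template.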
> > I have two separate ideas for formatting libraries:
> >
> > - one lightweight, which I posted, for input and output of ranges and
> > tuple-like objects
> > - one for output only, which allows much more customization; I see this as
> > an inverse of Spirit
> >
> > Your suggestion looks similar to the second (except that you want to
> > support input), so let me sketch my idea (which has improved since I
> > sketched it here last time), and then ask some questions about yours.
> >

> Formatted input would be optional (and maybe not practical).
>
> I do not understand what the advantages of the "lightweight"
> approach are (except compile time).

The aim of Format Lite was to present a facility which would be a candidate for
standardization, filling a need that was expressed at the most recent library
working group meeting. Towards this end,

1. I tried to keep the implementation as small as possible
2. I introduced just one new function template -- punctuate() -- in the public
interface
3. I used the same syntax currently recommended for formatting user-defined
types (overloading iostreams operators >> and <<)
4. I introduced no new class templates in the public interface
5. I introduced no new concepts. (Strictly speaking, I use Single Pass Range and
Extensible Range, but these could be replaced by the standard library container
concepts.)

If people like the interface (and so far there's not much evidence), I think
Format Lite would stand a reasonable chance of making it into TR2.

To get Serialization standardized would require a much bigger push, IMO,
although I'd like to see it happen -- perhaps with additional language support.

> Is switching between the lightweight and heavyweight solutions easy?

Yes, since formatting with Format Lite would still be the default when a Style
provides no specific formatting options for a given range or tuple-like type.
Consider the following:

      vector< string > v = list_of( ... );
      ostream& out = cout;
      styled_ostream<cajun_style> cajun_out(out);
      cajun_out << v;

If none of the Styles or Formatters associated with cajun_style knows how to
format a vector, cajun_out will delegate formatting to the underlying ostream
out, which will use the operator<< from Format Lite.
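
A rough sketch of how such delegation might work follows. It is not the actual implementation: the static format() hook on the Style, the write() helpers, and the vector inserter standing in for Format Lite are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Stand-in for the Format Lite inserter (assumed): prints a vector as [a, b].
template<class T>
std::ostream& operator<<(std::ostream& os, const std::vector<T>& v) {
    os << '[';
    for (std::size_t i = 0; i < v.size(); ++i) {
        if (i != 0) os << ", ";
        os << v[i];
    }
    return os << ']';
}

// Hypothetical style that only knows how to format strings (in quotes).
struct cajun_style {
    static void format(std::ostream& os, const std::string& s) {
        os << '"' << s << '"';
    }
};

template<class Style>
class styled_ostream {
public:
    explicit styled_ostream(std::ostream& out) : out_(out) {}

    template<class T>
    styled_ostream& operator<<(const T& value) {
        write(value, 0);  // the int argument prefers the Style-aware overload
        return *this;
    }

private:
    // Selected only when Style::format(os, value) is a valid expression.
    template<class T>
    auto write(const T& value, int)
        -> decltype(Style::format(std::declval<std::ostream&>(), value), void()) {
        Style::format(out_, value);
    }

    // Fallback: delegate to the underlying ostream's operator<<.
    template<class T>
    void write(const T& value, long) {
        out_ << value;
    }

    std::ostream& out_;
};

// The style has no vector formatter, so the underlying stream handles it.
std::string show_vector() {
    std::ostringstream out;
    styled_ostream<cajun_style> cajun_out(out);
    std::vector<std::string> v;
    v.push_back("red");
    v.push_back("beans");
    cajun_out << v;
    return out.str();
}

// The style does know strings, so its own formatter is used.
std::string show_string() {
    std::ostringstream out;
    styled_ostream<cajun_style> cajun_out(out);
    cajun_out << std::string("roux");
    return out.str();
}
```

In this sketch the fallback is all-or-nothing: once the vector is handed to the plain ostream, its elements are no longer styled. A real design would presumably want to do better than that.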

> [snip]
> > The advantages of this approach are:
> >
> > - an arbitrary amount of contextual information, such as indentation
> > and numbering, can be stored in the styled_stream and accessed directly
> > by formatters
> > - arbitrary user-defined types can be formatted non-intrusively
> > - flexible formatting is built-in for sequences and tuple-like types
> > (and user-defined types can choose to present themselves as sequences
> > or tuple-like types to take advantage of this feature.)
> >
> I feel a formatting descendant of a boost::archive stream
> could be made with the same features.

Good.

> > > The advantages I see:
> > > - the whole infrastructure of Boost.Serialization is available and
> > > ready and it handles all situations like cycles. format_lite could
> > > concentrate on just formatting.
> >
> > This is a big plus, obviously. (However, I remember Robert saying he
> > preferred to keep formatting and serialization separate.)
> >
> Formatting, as I see it, would just use Serialization as
> infrastructure. There would be no impact on Serialization
> from Formatting.

You mean no changes to the library code -- we would just define additional
archive concepts and types?

> > > - formatting directives can be "inherited" from
> > > "higher level of data" to "lower levels". Newly added data would
> > > not need formatting of its own by default. Change on higher level
> > > would propagate itself "down".
> >
> > Can you explain how this works?
> >
> I mean a trick with RAII (it's not really a Serialization feature):
>
> void serialize(formatting_archive& ar, const unsigned)
> {
> // change currently used formatting style
> formatting_setter raii(ar, "... formatting directives...");
> ar & data; // <<== new formatting style will be used
> // destructor of raii object will revert formatting back
> }

I see. I think this is a characteristic of all schemes where stylistic info is
stored in the stream or stream-like object.
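
For concreteness, here is a minimal sketch of such a scoped setter. It assumes the archive exposes its current style as a plain string member, which is a simplification for illustration; formatting_archive and formatting_setter below are hypothetical stand-ins, not existing library types.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical archive that carries its current formatting style.
struct formatting_archive {
    std::string style = "default";
};

// RAII guard: installs a style on construction, restores the old one on exit.
class formatting_setter {
public:
    formatting_setter(formatting_archive& ar, std::string style)
        : ar_(ar), saved_(ar.style) {
        ar_.style = std::move(style);
    }

    // swap() does not throw, so restoring is safe inside a destructor.
    ~formatting_setter() { ar_.style.swap(saved_); }

    formatting_setter(const formatting_setter&) = delete;
    formatting_setter& operator=(const formatting_setter&) = delete;

private:
    formatting_archive& ar_;
    std::string saved_;
};

// Returns the style that is in effect while the guard is alive.
std::string style_inside_scope(formatting_archive& ar) {
    formatting_setter raii(ar, "indented");
    return ar.style;  // copied out before raii's destructor runs
}
```

Anything formatted while the guard is alive sees the new style, including nested objects, and the previous style comes back when the scope exits, whether normally or through an exception.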

> > > - indentation for pretty printing could be handled (semi)automatically
> > > by formatting archive.
> >
> > Would this involve modifying the archive interface? I'd like a formatter
> > for a given type (or an overloaded serialize function) to be able to access
> > these properties directly.
> >
> Yes, formatting archive could have any additional interface.

I see. But no changes to existing archive types.

> > > - multiple formatting styles could be provided for any class.
> >
> > It would be one formatting style for each archive type for which serialize
> > has been specialized, correct? Would this allow styles for various types to
> > be mixed freely?
> >
> Yes, serialize() function would be specialized.

What I meant to ask can be illustrated by an example. Suppose you have two
classes, Duck and Goose. Duck and Goose each have two associated formatting
styles. The choice of styles should be independent, so we would need four
archive types to handle the various combinations.

Now my question is: would Duck need four specializations of serialize, or just
two? In my system, formatting options for Duck and Goose could be added to a
Style independently; I want to know if overloading serialize can handle this.
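
To illustrate what I mean by "independently", here is a simplified sketch. It is not the actual design: the style template, the per-type formatter structs, and describe() are names invented for this example.

```cpp
#include <cassert>
#include <string>

struct Duck {};
struct Goose {};

// Two hypothetical formatters per type, written once each.
struct duck_terse    { static std::string format(const Duck&)  { return "duck"; } };
struct duck_verbose  { static std::string format(const Duck&)  { return "Duck{}"; } };
struct goose_terse   { static std::string format(const Goose&) { return "goose"; } };
struct goose_verbose { static std::string format(const Goose&) { return "Goose{}"; } };

// A Style is assembled from one formatter per type, chosen independently.
template<class DuckFormatter, class GooseFormatter>
struct style {
    static std::string format(const Duck& d)  { return DuckFormatter::format(d); }
    static std::string format(const Goose& g) { return GooseFormatter::format(g); }
};

// Formats one Duck and one Goose with whichever Style is supplied.
template<class Style>
std::string describe(const Duck& d, const Goose& g) {
    return Style::format(d) + " & " + Style::format(g);
}
```

All four combinations are available from four formatters in total, two per class. With one archive type per combination and serialize overloaded on the archive type, it looks to me as though Duck alone would need an overload for each of the four archives, unless the archive types are related in some way that lets overloads be shared.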

> I see three ways to customize output:
>
> 1. Formatting archive has its own configuration for how
> to output data. This keeps overall style coherent
> and should be enough for most uses.
>
> 2. Specialization of serialize() could change
> formatting style. This may be used to fine
> tune the output here or there.
>
> 3. Specializations of serialize() may generate
> different outputs altogether.
>
> E.g. if you have archives:
> class high_level_formatting_archive {...}
> class all_details_formatting_archive { ... }
>
> you can omit details in
>
> void serialize(high_level_formatting_archive& ar, const unsigned);
>
> and use them all in
>
> void serialize(all_details_formatting_archive& ar, const unsigned);
>
> I think this (option 3) is not possible now with format_lite.

In fact, it only supports 1. That's part of what makes it 'lite'. If a class
already provides standard library inserters and extractors (corresponding to 2,
above), those provided by Format Lite will not be called.

I have a couple of questions:

1. Is your idea flexible enough to allow pairs (a,b) to be formatted with the
elements in reverse order?
2. If a type defines a member function serialize, can it be bypassed altogether
in favor of an end-user supplied formatting style?

> > My inclination is to
> > keep formatting separate from serialization, though, because they have
> > different aims. If you believe they can be integrated, without making
> > serialization harder to learn or sacrificing flexibility of formatting
> > options, it should definitely be considered.

> I see Serialization as just a vehicle, ready and handy and almost
> invisible to Formatting users.

> Simple data structures are very easy with Serialization and should
> be as easy as with format_lite now.

> If a user tries to format tricky structures (e.g. pImpl) he would need
> to dig into the Serialization docs, but at least there will be a chance to
> make the whole thing work. Serialization goes to great lengths
> to work under any situation and configuration.

This sounds quite reasonable, provided it is sufficiently flexible.

I wonder how much of the Serialization infrastructure is really needed, though.
Detecting cycles is definitely not something I want to reimplement; OTOH, I'm
not sure it's needed for pretty-printing. I haven't looked at the Serialization
implementation, but I did read the Java serialization specification several
years ago. IIRC, when an object was encountered for the second time, some sort
of placeholder would be inserted in the stream referencing the already
serialized data. I assume the Serialization library does something like this.
Would this really be desirable for human-readable output? Perhaps the formatting
library should concern itself only with cycle-free data structures.
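
To show what placeholder-style output might look like, here is a toy sketch. It assumes a simple singly linked structure and an invented "<ref #n>" notation; it does not reflect what the Serialization library (or Java) actually emits.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Toy linked structure; next may point back to an earlier node.
struct Node {
    std::string name;
    const Node* next;
};

// Prints each node once, numbering nodes in order of first appearance.
// A node met a second time is replaced by a back-reference placeholder.
std::string print_chain(const Node* start) {
    std::map<const Node*, int> seen;
    std::ostringstream out;
    const Node* n = start;
    while (n != nullptr) {
        std::map<const Node*, int>::const_iterator it = seen.find(n);
        if (it != seen.end()) {
            out << "<ref #" << it->second << ">";
            return out.str();
        }
        const int id = static_cast<int>(seen.size());
        seen[n] = id;
        out << n->name << "#" << id << " -> ";
        n = n->next;
    }
    out << "end";
    return out.str();
}
```

The extra numbering is needed only so that back-references have something to point at, and it clutters the output even for cycle-free data. That is part of why I doubt this is what a human reader wants.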

> /Pavel

Jonathan

