Subject: Re: [boost] [serialization] Dealing with any tainted types.
From: David Sankel (camior_at_[hidden])
Date: 2011-01-14 16:36:23


Thanks for your detailed response Robert...

On Thu, Jan 13, 2011 at 4:38 PM, Robert Ramey <ramey_at_[hidden]> wrote:

> David Sankel wrote:
> > The serialization library allows for the painless serialization of
> > most data types and even polymorphic types like variants and vectors.
> >
> > There is one type, however, that doesn't have a serialize function
> > that makes sense. That type is any. It is easy to see that if we make
> > an any serialize function, we're going to need to make some
> > assumptions at least about which subset of types it might contain.
> >
> > I have a complex algebraic datatype, let's call it C, that has a
> > member (that has a member...) of type T. T is "tainted" with a member
> > of type any. I'm looking at the possible ways to serialize C with
> > boost.serialization. We can know, at runtime, which subset of types
> > the any member of T can contain and how to serialize each of those
> > types. Let's call the dictionary type that has that information D.
> >
> > Here are the options I've come up with so far...
> >
> > Option 1:
> >
> > Make some global variable of type D, called d. Before calling any
> > serialization of C, I ensure d has the correct value. The serialize
> > function for T uses d to figure out how to serialize the any member.
> >
> > This option would certainly work, but I'm using a global variable as a
> > workaround of the fact that I cannot add arguments to T's serialize
> > function. No points for beauty here.
>
> Take a look at how shared_ptr is serialized. It seems to me a similar
> problem. This was handled by adding a "helper" class just for this
> shared_ptr type. Such a "helper" could hold the otherwise "global"
> variable just for that archive instance, thus maintaining the
> thread-safe characteristic of the library.
>
> > Option 2:
> >
> > Instead of serializing a value of type C, serialize a value of type
> > "struct CWithDict { C c; D d; }". In this serialization function I
> > can use d whenever I need.
> >
> > Unfortunately the contents of this serialize function would need to
> > duplicate most of the functionality of boost.serialize in the first
> > place since T is buried deep within C's structure. Although this
> > option works, it requires rewriting a bunch of serialize which isn't
> > attractive.
>
> Take a look at "extended type info". This extends the rtti system
> to handle types identified by a string at runtime. This is the basis
> for the "export" functionality.
>
> > Option 3:
> >
> > Make an archive wrapper:
> >
> > template< typename Archive >
> > struct DArchive
> > {
> > Archive a;
> > D d;
> > };
> >
> > DArchive would model the Archive concept by forwarding functionality
> > to a. However, the T serialize function could access DArchive's d
> > member for serialization of the any.
> >
> > This would solve the problem at the expense of extending the meaning
> > of Archive a bit. It seems pretty elegant to me.
>

In retrospect, this clearly won't work. As soon as a << x happens, d
is unavailable for the serialization of all of x's descendants.
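
To make the failure concrete, here is a minimal sketch of it. Everything in
it is a hypothetical stand-in rather than Boost.Serialization code:
ToyArchive, the free save() functions, find_dict(), and the demo-only global
dict_seen_by_T are assumptions made for illustration, and T's any member is
omitted. The wrapper forwards with a << x, so every nested save() is
instantiated with the inner archive type and never sees d.

```cpp
#include <cassert>
#include <string>

struct D { std::string name; };   // stand-in for the dictionary type
struct T { int value; };          // the "tainted" type (any member omitted)
struct C { T t; };                // T is buried inside C

struct ToyArchive;                                  // hypothetical archive
template <typename Ar> void save(Ar& ar, const C& c);
template <typename Ar> void save(Ar& ar, const T& t);

struct ToyArchive {
    // Dispatch to the free save() overloads, passing this archive along.
    template <typename X>
    ToyArchive& operator<<(const X& x) { save(*this, x); return *this; }
};

// The wrapper from Option 3: forwards to a and carries d alongside.
template <typename Archive>
struct DArchive {
    Archive a;
    D d;
    template <typename X>
    DArchive& operator<<(const X& x) { a << x; return *this; }  // d is dropped here
};

// Returns the dictionary if the archive type carries one, otherwise null.
template <typename Ar> const D* find_dict(Ar&) { return nullptr; }
template <typename Ar> const D* find_dict(DArchive<Ar>& ar) { return &ar.d; }

const D* dict_seen_by_T = nullptr;  // demo-only: what T's save() could reach

template <typename Ar>
void save(Ar& ar, const C& c) { ar << c.t; }        // recurse into the member

template <typename Ar>
void save(Ar& ar, const T&) { dict_seen_by_T = find_dict(ar); }

// Serializes a C through the wrapper and reports whether T's save() saw d.
bool dictionary_reaches_T() {
    DArchive<ToyArchive> da{ToyArchive{}, D{"types"}};
    assert(find_dict(da) == &da.d);  // d is visible at the top level only
    da << C{T{42}};
    return dict_seen_by_T != nullptr;
}
```

In this sketch dictionary_reaches_T() returns false: by the time save() runs
for T, Ar is ToyArchive, not DArchive<ToyArchive>.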

> This seems similar to the "helper" described above. That is, there is the
> concept of a "naked_text_iarchive". text_iarchive looks something like:
>
> class text_iarchive : public naked_text_iarchive, shared_ptr_helper
> {
> ...
> };
>
> This seems similar to what you want to do.
>

Yup. I was able to hack something up to do what I want. But...

> <snip>
> To really do this right, I see the following as necessary
>
> a) Clarify and simplify the current archive concept. I've thought about
> this a lot and know what I want to do - but I'm not excited enough to
> do it.
>

I've been giving some thought to this. Not so much clarifying and
simplifying as distilling the essence of the domain. I have a feeling
that if we nail the essence down, all the compositionality will be there
without having to tack it on as an afterthought.

Here's what I have so far:

concept Archive:
  struct _ where
  { typedef _ RState
  ; typedef _ WState

  ; template< typename T >
    struct lookup
    { typedef _ type // This _ is either mpl::void_ or
                      // std::pair< function< void (RState&, const T&) >
                      // , function< T (WState&) >
                      // >
    ; type operator()() const;
    }
  };

A type models the Archive concept if it fills in the blanks above. RState
and WState are the state information required for reading and writing. The
lookup type function, passed a type T, returns either mpl::void_ or a
pair. If it returns void_, we know that T is not considered a primitively
serializable type. If it returns the pair, we know T is a serializable type,
witnessed by the pair of write and read functions returned by operator().
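
As a rough illustration, one way to fill in those blanks in plain C++ might
look like the sketch below. The names ToyArchive, not_primitive (standing in
for mpl::void_ so the example needs no Boost headers), is_primitive, and
round_trip are all hypothetical. Both state types are assumed to be a
std::stringstream, and the two function signatures are copied as written in
the concept.

```cpp
#include <cassert>
#include <functional>
#include <sstream>
#include <type_traits>
#include <utility>

struct not_primitive {};  // stand-in for mpl::void_

struct ToyArchive {
    typedef std::stringstream RState;  // assumption: one text stream serves as both states
    typedef std::stringstream WState;

    // Default: T is not a primitively serializable type for this archive.
    template <typename T>
    struct lookup {
        typedef not_primitive type;
        type operator()() const { return type(); }
    };
};

// int is declared primitive; operator() returns the witnessing function pair.
template <>
struct ToyArchive::lookup<int> {
    typedef std::pair<std::function<void(ToyArchive::RState&, const int&)>,
                      std::function<int(ToyArchive::WState&)> > type;
    type operator()() const {
        return type(
            [](ToyArchive::RState& s, const int& v) { s << v << ' '; },
            [](ToyArchive::WState& s) { int v = 0; s >> v; return v; });
    }
};

// True when Archive::lookup<T> yields the function pair rather than the void stand-in.
template <typename Archive, typename T>
struct is_primitive
    : std::integral_constant<
          bool,
          !std::is_same<typename Archive::template lookup<T>::type,
                        not_primitive>::value> {};

// Writes v with the first function, then reads it back with the second.
int round_trip(int v) {
    ToyArchive::lookup<int>::type fns = ToyArchive::lookup<int>()();
    std::stringstream state;
    fns.first(state, v);
    int out = fns.second(state);
    assert(!state.fail());
    return out;
}
```

Here is_primitive<ToyArchive, int>::value is true while
is_primitive<ToyArchive, double>::value is false, since only int has been
given a specialization.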

One key condition is that the primitive types for an Archive, if they are
templates, must be *fully saturated*, meaning that:

template<>
lookup< std::vector<bool> >

is fine, but

template<typename T>
lookup< std::vector<T> >

is not. This condition prevents recursive lookup calls with non-primitives.
This, I think, is going to be the key to compositionality later. More to
come...
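
A small sketch of the saturation condition, under the same kind of
assumptions: ToyArchive, not_primitive (again standing in for mpl::void_),
primitive_tag (a placeholder for the function pair), and is_primitive are
hypothetical names. Only the explicit specialization for std::vector<bool>
is provided, so no other vector instantiation becomes primitive.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

struct not_primitive {};  // stand-in for mpl::void_
struct primitive_tag {};  // placeholder for the pair of write/read functions

struct ToyArchive {
    template <typename T>
    struct lookup { typedef not_primitive type; };
};

// Fully saturated: every template argument of the primitive type is fixed.
template <>
struct ToyArchive::lookup<std::vector<bool> > { typedef primitive_tag type; };

// What the condition rules out (deliberately left commented out):
// template <typename T>
// struct ToyArchive::lookup<std::vector<T> > { typedef primitive_tag type; };

// True when lookup<T> yields something other than the void stand-in.
template <typename T>
constexpr bool is_primitive() {
    return !std::is_same<typename ToyArchive::lookup<T>::type,
                         not_primitive>::value;
}

// Runtime check of the two cases discussed above.
void check_saturation() {
    assert(is_primitive<std::vector<bool> >());
    assert(!is_primitive<std::vector<int> >());
}
```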

Does all of this make sense so far?

David

-- 
David Sankel
Sankel Software
www.sankelsoftware.com
