Boost logo

Boost :

From: Vladimir Prus (ghost_at_[hidden])
Date: 2005-05-12 03:57:03


Robert Ramey wrote:

>
>>> OK let me rephrase. This code conflicts with the description of the
>>> description of the functioning of the operator << as described in the
>>> documentation.
>>
>> Can you give me specific doc URL? I can't find the relevant part in
>> TOC.
>
> Serialization/Reference/Special Considerations/Object tracking
>
> explains this.

Ok, I found this.

>>> referene and still be modified during the process of serialization.
>>> So this is not bullet proof - but I believe it is very helpful.
>
>> Let me draw an analogy: there's std::ostream. For years, we save
>> objects to ostreams and we're not required to save "const" object.
>> The fact that serialization acts in a different way is inconsistent
>> and confusing. I
>> think you'll agree that users should not read documentation just to
>> save an object.
>
> std::ostream doesn't need to track the objects saved. So passing objects
> which change while in the course of creating an output file doesn't create
> a
> problem, while in the serialization system it DOES create an error.

I'll comment on the error later. But the important point is that difference
from iostream, no matter how much explanation you put in docs, will be
confusing. Moreover, as soon as users get into habit of const_casting
serialized things, your protection no longer works.

   for(;;) {
        A x = a[i];
        ar << x;
   }

User gets error above changes this to

   for(;;) {
        const A x = a[i];
        ar << x; // Oops, error here, change this to
   }

and circumvents your protection. If there's no way to serialize non-const
object, users will just start using const.

>> And let me clarify again -- is this indended to stack only stack
>> allocated objects?
>>
>
> No. Its for any object whose contents might change while in the course of
> creating an archive.
> Example.
>
> X * x;
> ..
> ar << x;
> ....
> *x = y;
> ...
> ar << x; // uh - oh bug introduced since x is being tracked.
>
> The static assert will flag this as an error at compile time.

Ok, at least I understand your intentions now. But again note that if this
static check triggers in situation users consider save, they'll quickly
learn to use casts. Everywhere.

>>> Its very easy to write
>>> for(...{
>>> X x = *it; // create a copy of
>>> ar << x
>>> }
>>>
>>> all the x's are at the same address so if they happen to be tracked
>>> because a pointer to some X in serialized somewhere in the program,
>>> then the
>>> subsequent copies would be supprsed by tracking.
>>
>
>> So, you're saying that the behaviour of the above code will be
>> different, depending on whether some pointer to X is saved somewhere
>> else? This is very strange -- here, saving is not done via pointer, so I
> expect
>> that no tracking is ever done. I'd argue this is a bug in
>> serialization.
>
> Hey - that's not a bug - its a feature !
>
> tracking is only done if its necessary to guarentee that the objects can
> be loaded correctly.

I understand the need for tracking. What I don't understand is why tracking
is enabled when I'm saving *non pointer*.

Say I'm saving object of class X:

   X x;
   ar << x;

Then, for reasons I've explained in the previous email, no heap allocated
object will have the same address as 'x', so no tracking is needed at all.

>> The only problematic case is
>>
>> for(...{
>> X x = *it; // create a copy of
>> ar << x
>> }
>>
>> X* x = new X:
>> ar << x;
>>
>> where address of newed 'x' is the same as address of saved 'x'. But
>> this can never happen, because heap memory and stack memory are
>> distinct.
>>
>
> In the loop, a new value for x is set every iteration through the loop.
> But
> the address of x (on the stack) is the same every time. If X is tracked,
> only the first one will be saved. When the archive is loaded, all the x
> values will be the same - not what is probably intended. So, the question
> is what is really intended here and trapping as an error requires that
> this question be considered.

Exactly. As I've said above, I believe saves of 'x' inside the loop should
not do tracking since we're not saving via a pointer.

>>> With the above
>>> formulation this cannot happen. This error is detected a compile
>>> time - very convenient. Of course the following would work as well
>>>
>>> for(...{
>>> const & X x = *it; // save a const reference to the orgiinal
>>> ar << x
>>> }
>>
>> Will that indeed save a const reference? How will you read const
>> reference from an archive?
>
> This works as one would expect - that is all the objects are saved and
> tracked separtly.
> The const reference doesn't have its own address - taking the address of
> it returns the address of the object being refered to - just what we want
> for tracking.

This behaviour is strange. In C++ reference almost always acts like
non-reference type, and it's common to use referece to create convenient
local alias. I'd expect saving of "const &X" to works exactly as saving of
"const X".

>>> while the following would create an error that would go undetected.
>>>
>>> for(...{
>>> const X x = *it; // create a copy of
>>> ar << x
>>> }
>>
>> Probably the right way to fix this is just don't track 'x' here.
>> Again, no heap allocated object will have the same address as 'x'.
>
> in this case x is being allocated on the stack - NOT on the heap. And all
> the x's have the same stack address.

Yes, they have the same stack address, but it does not matter, because we're
not saving them via pointer. If we wrote something like:

   ar << &x;

then the object would have to be tracked. But saving pointer to stack object
is problematic on its own.

>>> When finish returns something that is not a const it suggests that
>>> its returning a pointer to a mutable object. The serialization
>>> library presumes
>>> that objects don't change in the course of serialization. So
>>> passing a non-const conflicts with one of the assumptions made in
>>> the implementation of the library.
>>
>
>> Ehm... the standard way to solve this is to declare "operator<<" with
>> the "const T&" type, just like iostreams do.
>
> We're doing this now. that is
>
> Archive::operator<<(const T & ...)
>
> But using const indicates that the callee won't mutate the object. It
> doesn't require that the object being passed be imutable - which is what
> we're trying to check for here. In other words even though a paramter is
> declared const - that doesn't require tha the object passed be const. In
> fact, the compiler is permitted to create a copy of an non-const and pass
> it to a const paramter. I haven't seen a compiler actually do this - but
> it is
> permitted.

The only case I know about is binding rvalue to const reference.

> It would create havoc with the tracking system which is
> essential to re-creating archived pointers. The new code follows
> compilier rules more carefully and should work even if some compiler makes
> a copy to when converting a non-const paramter to a const.

Again, when saving non-pointer no tracking should be necessary.

>>> If its a temporary you could
>>> just
>>> as well return an auto_ptr to a const T. But really without more
>>> context its hard for me to commment.
>>
>> 1. I don't want to change my design due to serialization library. It's
>> intrusive.
>
> This trap is telling you that maybe something in your design conflicts
> serialization in a fundamental way and this should be checked.

But I've checked and nothing's wrong. So I either have to modify my design
-- which I don't want, or add very strange-looking cast.

>> 2. Inside 'finish', I need to work with auto_ptr<T> -- because I
>> modify the data.
>
> That's it. when you call finish from with an archive save, you're telling
> me that the process of saving is changing the data. The serialization
> library implementation (tracking) assume that won't happen. So if your
> design is such that this is what you really want to do, then T should be
> marked track_never. This will inhibit tracking and suppress the error
> message.
>
> Without this checking, this issue would go undetected until runtime - and
> during a load at that. And be very difficult to find.

Sorry, probably I did not elaborate enough. The finish code looks like:

        auto_ptr<Data> finish()
        {
                auto_ptr<Data> result;
                // modify result
                return result;
        }

I modify 'result' inside 'finish', not inside any serialization code. And I
can't change return type to auto_ptr<const Data> because the code won't
compile. And

   return auto_ptr<const Data>(result.release())

is scary.

So, there's no bug in my code yet ;-)

>>>> 2. How do you suggest me to fix the above?
>>>
>>> I did suggest using & instead. But I'm curious to see more context.
>>> I'm curious to see more.
>>
>> Here's the enclosing method:
>>
>> template<class Archive>
>> void save(Archive & ar, unsigned int version) const
>> {
>> // Serialization has problems with shared_ptr, so use strings.
>> ar << boost::lexical_cast<std::string>(*this);
>> }
>>
>
> LOL - well I would disagree that serialization has problems with
> shared_ptr - My view is that shared_ptr has problems with serialization.
> The next version will reconcile differing points of view on this subject.

That would be much welcome, no matter how it's done.

> That aside I would expect to use something like the following
>
> template<class Archive>
> void save(Archive & ar, unsigned int version) const
> {
> // Serialization has problems with shared_ptr, so use strings.
> const std::string s = boost::lexical_cast<std::string>(*this);

I don't like the copy.

> ar << s
> }
>
> Maybe the above might be
> ar << boost::lexical_cast<const std::string>(*this);

I think this wont work because loading from stream to "const std::string"
won't compile.

> template<class Archive>
> void load(Archive & ar, unsigned int version) // note no "const" here
> {
> // Serialization has problems with shared_ptr, so use strings.
> std::string s
> ar >> s;
> *this = s; // whatever means
> }
>
> Now this could create a tracking problem.

Tracking problem here? I'm just saving a string!

> So if ths is really what needs
> to be done, I would create an wrapper:
>
> struct untracked_string : public std::string {
> ...
> };
>
> and set untracked_string to "track_never"

I hope you'll agree that this solution is rather inconvenient.

> BTW in the case above we're not serialization a shared_ptr as far as I can
> tell so I would expect
> to just have
>
> serialize(Archive &ar, unsigned int version){
> ar & ... members
> }

There's shared_ptr somewhere inside 'members'.

> Its really odd to me to see something like
>
> ar << *this
>
> or
>
> ar << *ptr
>
> which means we're circumventing and maybe re-implementing part of the
> serialization library.

I think that for a split 'save/load' you'll always have to use operator<<
and operator>> of the archive.

- Volodya


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk