Boost logo

Boost :

From: Vladimir Prus (ghost_at_[hidden])
Date: 2005-05-11 02:01:27


Robert Ramey wrote:

>>>>>
>>>>> ofstream ofs(av[2]);
>>>>> boost::archive::binary_oarchive oa(ofs);
>>>>> oa << *data_builder.finish();
>>>>>
>>>
>>> This code should be considered erroneous. The documentation in 1.32
>>> addressed this but unfortunately the enforcement was lost in the
>>> shuffle.
>>
>> Excuse me, but I refuse to accept that this code is erroneous. As
>> I've said, this is my existing design -- the 'finish' function
>> returns non-const pointer. The fact that somewhere else, I serialize
>> the result of 'finish'
>> has nothing to do with 'finish' return type, and I should not be
>> forced to change my existing interface just because I use
>> serialization somewhere.
>
> OK let me rephrase. This code conflicts with the description of the
> description of the functioning of the operator << as described in the
> documentation.

Can you give me specific doc URL? I can't find the relevant part in TOC.

>> So, you're saing that "const" == "no allocated on stack"? I don't see
>> why this statement is true. I can just as well do this:
>>
>> void foo(const A& a)
>> {
>> ar << a;
>> }
>>
>> and circumvent your protection.
>
> The intention is that const indicates that the process of serialization
> will
> not change the object being serialized. Serializing and object that
> cannot
> easily be passed as a const is quite possibly an error. Of course, the
> contrary is not necessarily true. That isan object

What's "isan"?

> can be passed as const
> referene and still be modified during the process of serialization. So
> this is not bullet proof - but I believe it is very helpful.

Let me draw an analogy: there's std::ostream. For years, we save objects to
ostreams and we're not required to save "const" object. The fact that
serialization acts in a different way is inconsistent and confusing. I
think you'll agree that users should not read documentation just to save an
object.

And let me clarify again -- is this indended to stack only stack allocated
objects?

>>> Note that the documentation suggests that he above be reformulated as
>>> For(...
>>> ar << a[i]; //
>>>
>>> Enforcing const-ness also has the effect of preventing serialization
>>> from altering the state of the object being serialized - another
>>> almost impossible to find bug.
>>
>
>> Can you explain:
>> 1. How this bug can happen
>
> Its very easy to write
> for(...{
> X x = *it; // create a copy of
> ar << x
> }
>
> all the x's are at the same address so if they happen to be tracked
> because a pointer to some X in serialized somewhere in the program, then
> the
> subsequent copies would be supprsed by tracking.

So, you're saying that the behaviour of the above code will be different,
depending on whether some pointer to X is saved somewhere else? This is
very strange -- here, saving is not done via pointer, so I expect that no
tracking is ever done. I'd argue this is a bug in serialization.

The only problematic case is

 for(...{
     X x = *it; // create a copy of
     ar << x
 }

 X* x = new X:
 ar << x;

where address of newed 'x' is the same as address of saved 'x'. But this can
never happen, because heap memory and stack memory are distinct.

> With the above
> formulation this cannot happen. This error is detected a compile time -
> very convenient. Of course the following would work as well
>
> for(...{
> const & X x = *it; // save a const reference to the orgiinal
> ar << x
> }

Will that indeed save a const reference? How will you read const reference
from an archive?

> while the following would create an error that would go undetected.
>
> for(...{
> const X x = *it; // create a copy of
> ar << x
> }

Probably the right way to fix this is just don't track 'x' here. Again, no
heap allocated object will have the same address as 'x'.
 
>> 2. Why the "watch" command on gdb is not a reliable way to catch such
>> bugs?
>
> I'm not sure what the gdb watch command does but I'm sure it doesn't
> detect compile time errors.

It allows to find all places where a specific memory location is modified.
So, if you have a bug where a value of an object is modified by
serialization, it's easy to track this down to specific code line.

>>> Also, I had to tweak a number of my tests and demos to make them
>>> work with this new rule. However, the tweaking was not difficult. If
>>> altering one's code to conform this rule isn't easy, one should make
>>> an effort to understand why. Its possible that a very subtle and
>>> hard to understand bug might be being introduced.
>>
>> Are you saying that my code is buggy?
>
> I'm saying you've passed up an opportunity to permit the compiler to flag
> someting that could be an error.
>
>>>>> ofstream ofs(av[2]);
>>>>> boost::archive::binary_oarchive oa(ofs);
>>>>> oa << *data_builder.finish();
>
> Looking at the above, I have to concede I never envisioned it being used
> this way. I would need more context to really comment intelligently on
> it. I will say that I alwas envisioned the interface as being used in a
> declarative style. That is, I envisioned the the serialization
> declarations
> ar & x or ar << x as a shorthand for "this member is persistent". I'm
> not saying that you're doing anything wrong, its just not what I expected.

What kind of context should I provide? This is top-level code, it reads some
files, passes them to 'data_builder' and then calls 'finish' that builds
the proper data which is then saved.

>>>>> Where 'finish' returns auto_ptr<Data>. It's looks like
>>>>> serialization checks if the serialized type is 'const' and if not,
>>>>> complains.
>
> correct
>
>>>>> Basically, it's some heuritic to prevent saving
>>>>> on-stack object
>
> That's one case I would like trap.

Any others?

>>>>> (though I don't understand why it would work at al).
>
> I don't understand this.
>
>>>>> I find this a bit too much . I have no reason whatsoever to make
>>>>> 'finish' return auto_ptr<const Data>. And writing
>>>>>
>>>>> oa << const_cast<const Data&>(*data_builder.finish());
>
> When finish returns something that is not a const it suggests that its
> returning a pointer to a mutable object. The serialization library
> presumes
> that objects don't change in the course of serialization. So passing a
> non-const conflicts with one of the assumptions made in the implementation
> of the library.

Ehm... the standard way to solve this is to declare "operator<<" with the
"const T&" type, just like iostreams do.

>>> That is why the STATIC_ASSERT only trips when
>>> tracking is enabled for the corresponding datatype.
>>>>> looks very strange. And modifying all places where I get this error
>>>>> is not nice too. So, can this static assert be removed?
>
> Its there for a purpose. How many places do you get this error? Returning
> an auto_ptr to the top of the stack is interesting to me. doesn't that
> destroy the one in the original location?

No, returning auto_ptr from a function is one of its intended usages.

> If its a temporary you could
> just
> as well return an auto_ptr to a const T. But really without more context
> its hard for me to commment.

1. I don't want to change my design due to serialization library. It's
intrusive.

2. Inside 'finish', I need to work with auto_ptr<T> -- because I modify
   the data. And conversion from auto_ptr<T> to auto_ptr<const T> does not
   work for me on gcc.

>> Ok,
>> 1. Why std::string is 'tracked type'?
>
> so that storage space isn't wasted storing repeated strings
>
>> 2. How do you suggest me to fix the above?
>
> I did suggest using & instead. But I'm curious to see more context. I'm
> curious to see more.

Here's the enclosing method:

  template<class Archive>
  void save(Archive & ar, unsigned int version) const
  {
    // Serialization has problems with shared_ptr, so use strings.
    ar << boost::lexical_cast<std::string>(*this);
  }

- Volodya


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk