From: Vladimir Prus (ghost_at_[hidden])
Date: 2005-06-23 09:13:00


On Thursday 12 May 2005 20:19, Robert Ramey wrote:

> I'm aware that many programmers avoid using "const" because it seems to
[snip explanations why 'const' is good]
> But all in
> all I've come to strongly believe that picking b) is a much better choice
> and my handling of this issue in the serialization library reflects that.

Each check is only reasonable if it finds more bugs than it causes problems.
We seem to disagree about the proportion for the *specific case* of the check
in the serialization library.

> > Ok, at least I understand your intentions now. But again note that if
> > this static check triggers in situations users consider safe, they'll
> > quickly learn to use casts. Everywhere.
>
> Of course you're correct on this. But lots of people drive without seat
> belts too. That doesn't mean that the rest of us should be prohibited from
> using them.

I don't think the analogy is correct. I guess if you were required to refasten
the belt each time you changed gear, you wouldn't be using it.

> > I understand the need for tracking. What I don't understand is why
> > tracking is enabled when I'm saving *non pointer*.
> >
> > Say I'm saving object of class X:
> >
> > X x;
> > ar << x;
> >
> > Then, for reasons I've explained in the previous email, no heap
> > allocated object will have the same address as 'x', so no tracking is
> > needed at all.
>
> The default trait for non-primitive objects is "track_selectivly". This means
> that objects will be tracked if and only if, anywhere in the source, some
> object of this class is serialized through a pointer. So when I'm
> compiling one module above and checking at compile time I really don't know
> that the object will in fact be tracked. So I trap unless the serialization
> trait is set to "track_never". To reiterate, in this case the object won't
> be tracked unless somewhere else it's serialized as a pointer.

I'm not sure this behaviour is right. It certainly matters whether I save the
*same* object as a pointer or not. Why does it matter if I save *another*
object by pointer?

Suppose you're saving an object with the same address twice. The possible
situations are:

1. Both saves are via pointers. You enable tracking for this address; only
one object is actually saved. User is responsible for making sure that the
object does not change between saves.

2. First save is via pointer, the second is not by pointer. You throw
pointer_conflict.

3. First save is not by pointer, second is by pointer. You enable tracking for
this address.

4. Both saves are not via pointer. You don't track anything.

Is there anything wrong with the above behaviour?
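
The four cases can be sketched as a small address tracker. This is only an illustration under assumptions: `AddressTracker`, `SaveAction` and `PointerConflict` are made-up names, not the library's API, and what the pointer save writes in case 3 is not spelled out above, so the sketch simply writes the object again.

```cpp
#include <cassert>
#include <set>
#include <stdexcept>

// Hypothetical names throughout; this is not the Boost.Serialization API.
enum class SaveAction { WriteObject, WriteBackReference };

struct PointerConflict : std::runtime_error {
    PointerConflict() : std::runtime_error("pointer_conflict") {}
};

class AddressTracker {
public:
    // Save of an object that is not a pointer ("ar << x").
    SaveAction save_object(const void* addr) {
        // Case 2: this address was already saved through a pointer.
        if (tracked_.count(addr)) throw PointerConflict();
        // Case 4 (or the first half of case 3): remember the address,
        // but always write the data and never mark it as tracked.
        saved_.insert(addr);
        return SaveAction::WriteObject;
    }

    // Save through a pointer ("ar << p").
    SaveAction save_pointer(const void* addr) {
        // Case 1, second save: the object is already in the archive.
        if (tracked_.count(addr)) return SaveAction::WriteBackReference;
        // Case 1, first save, or case 3: enable tracking for this address.
        // Assumption: the object data is written at this point.
        tracked_.insert(addr);
        saved_.insert(addr);
        return SaveAction::WriteObject;
    }

    bool is_tracked(const void* addr) const { return tracked_.count(addr) != 0; }
    bool was_saved(const void* addr) const { return saved_.count(addr) != 0; }

private:
    std::set<const void*> saved_;    // every address saved so far
    std::set<const void*> tracked_;  // addresses saved through a pointer
};

// Case 4 in isolation: repeated non-pointer saves of one address.
inline void example_usage() {
    AddressTracker tracker;
    int on_stack = 0;
    assert(tracker.save_object(&on_stack) == SaveAction::WriteObject);
    assert(tracker.save_object(&on_stack) == SaveAction::WriteObject);
    assert(!tracker.is_tracked(&on_stack));
}
```

With a rule like this, a loop that saves a stack variable repeatedly writes every value, and tracking only starts for an address once that same address is saved through a pointer.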

> As an aside one might want to track object never serialized as pointers.
> That's why there is a serialization trait "track_always". This might occur
> where objects might be the objects of references from several instances of
> another class:
>
> class a {
> X & m_x;
> ....
> };
>
> tracking would guarantee that only one copy of the same X would be written
> to the archive - thus saving space. This would be an unusual case but it's
> supported if necessary.

And how would you deserialize this, given that references are not rebindable?
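
A small sketch of the difficulty, assuming a trivial `X` and a class `A` shaped like the quoted `class a` (both are illustrative): a reference member must be bound when the object is constructed, and a later assignment writes through to the original referent instead of rebinding.

```cpp
#include <cassert>
#include <type_traits>

struct X { int value; };  // illustrative payload

// Shaped like the quoted "class a": it holds a reference to an X.
class A {
public:
    explicit A(X& x) : m_x(x) {}  // the reference has to be bound here
    X& m_x;
};

// A loader cannot default-construct an A and fill it in afterwards,
// and it cannot overwrite an existing A by assignment either.
static_assert(!std::is_default_constructible<A>::value,
              "A cannot be created without an X to refer to");
static_assert(!std::is_copy_assignable<A>::value,
              "a reference member deletes the implicit copy assignment");

// Returns true if assigning through m_x modified the original referent
// while m_x kept referring to the same object.
inline bool assignment_writes_through() {
    X first{1};
    X second{2};
    A a(first);
    a.m_x = second;  // copies second into first; m_x is not rebound
    return &a.m_x == &first && first.value == 2;
}
```

So loading would need the referenced X to exist before each object holding the reference is constructed.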

> >>> for(...) {
> >>> X x = *it; // create a copy
> >>> ar << x;
> >>> }
> >>>
> >>> X* x = new X;
> >>> ar << x;
> >>>
> >>> where address of newed 'x' is the same as address of saved 'x'. But
> >>> this can never happen, because heap memory and stack memory are
> >>> distinct.
> >>
> >> In the loop, a new value for x is set every iteration through the
> >> loop. But
> >> the address of x (on the stack) is the same every time. If X is
> >> tracked, only the first one will be saved. When the archive is
> >> loaded, all the x values will be the same - not what is probably
> >> intended. So, the question is what is really intended here and
> >> trapping as an error requires that this question be considered.
> >
> > Exactly. As I've said above, I believe saves of 'x' inside the loop
> > should not do tracking since we're not saving via a pointer.
>
> How do we know that one of the x's saved in the loop is not serialized as a
> pointer somewhere else?

You keep a set of addresses of all saved objects.

> We have to track ALL x's because we don't know
> which ones if any are being tracked somewhere else. It could even be in a
> different module.

Right, you need to track all addresses while saving, but in the archive the saves
from the above loop need not be marked as tracked.

> > This behaviour is strange. In C++ a reference almost always acts like
> > a non-reference type, and it's common to use a reference to create
> > a convenient local alias.
> >
> > I'd expect saving of "const X&" to work exactly like saving of "const X".
>
> Saving it does. The problem is that
>
> const X x = some_x;
>
> is quite different than
>
> const X & x = some_x
>
> That is, creating a reference is altogether different from making a copy of
> an object. The strong analogy between these operations and the automatic
> invocation of the copy operation constitutes one of the main features of
> C++ which can be considered both a blessing and a curse. On one hand it
> makes the language expressive by hiding the natural copies while on the
> other hand these hidden copies are a major source of hard to find bugs.
> In any case, it's out of my hands.

I don't understand anything of the above. To give another example:

  const X x;
  const X& x2 = x;

Are you saying that saving them works differently?
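
To make the question concrete, here is a sketch with a hypothetical `saved_address` function template standing in for an archive's save entry point (it is not the library's interface). From the callee's side the object and its reference alias are indistinguishable: the deduced type and the address are both the same.

```cpp
#include <cassert>
#include <type_traits>

struct X { int value; };  // illustrative payload

// Hypothetical stand-in for a save function taking its argument by
// const reference; it reports the address it was handed.
template <class T>
const void* saved_address(const T& t) {
    static_assert(std::is_same<T, X>::value,
                  "T is deduced as plain X for the object and for the alias");
    return &t;
}

// Returns true if saving x and saving its alias x2 look identical.
inline bool object_and_alias_look_identical() {
    const X x{42};
    const X& x2 = x;
    return saved_address(x) == saved_address(x2) &&
           saved_address(x) == static_cast<const void*>(&x);
}
```

Any difference in behaviour therefore has to come from how the value was created (copy versus alias), not from anything the save call can observe.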

> > But I've checked and nothing's wrong. So I either have to modify my
> > design -- which I don't want, or add very strange-looking cast.
>
> You have three other options:
> a) use & operator instead of <<
> b) set the tracking trait to "track_never"
> c) tweak your code so the trap is never invoked. (hypothetical only)
>
> By the way the const_cast is a good choice for another reason. It
> specifically flags a case which should be checked if the program has
> surprising behavior. Suppose you've checked everything and it's what you
> want to do so you put in a const_cast to avoid the trap. Then months later
> you add a module to your program which serializes a pointer to X. Now your
> code is broken in a surprising way. When you start debugging you'll see
> the "const_cast" and it might draw your attention to something that should
> be checked.

If saving unrelated pointers does not magically change save behaviour of all
other 'X' instances, then adding another module won't break my program in the
first place.

> which we are quite comfortable with. In fact I believe
>
> auto_ptr<const Data> finish()
> {
> auto_ptr<Data> result;
> // modify result
> return result;
> }
>
> expresses your intention quite well. That you're returning an auto_ptr to
> an object that you don't expect should be changed by anyone who gets the
> pointer this way.

Except that it does not compile.

> >>>>> 2. How do you suggest me to fix the above?
> >>
> >> That aside I would expect to use something like the following
> >>
> >> template<class Archive>
> >> void save(Archive & ar, unsigned int version) const
> >> {
> >> // Serialization has problems with shared_ptr, so use strings.
> >> const std::string s = boost::lexical_cast<std::string>(*this);
> >
> > I don't like the copy.
> >
> >> ar << s
> >> }
> >>
> >> Maybe the above might be
> >> ar << boost::lexical_cast<const std::string>(*this);
> >
> > I think this won't work because loading from stream to "const
> > std::string" won't compile.
>
> I believe we're doing the opposite here. creating a const std::string (on
> the stack) from a non-const one. This is exactly equivalent to the above
>
> const std::string s = boost::lexical_cast<std::string>(*this);
> ar << s;
>
> it's just that the copy is hidden.

Did you check lexical_cast.hpp? It will try to change 'const std::string'.

> >> So if this is really what needs
> >> to be done, I would create a wrapper:
> >>
> >> struct untracked_string : public std::string {
> >> ...
> >> };
> >>
> >> and set untracked_string to "track_never"
> >
> > I hope you'll agree that this solution is rather inconvenient.
>
> I do. That's why I prefer one of the other ones above. But it does point
> to an interesting issue. The serialization traits for primitives are set to
> "track_never". It's sort of an easily remembered heuristic in that usually
> we wouldn't want to track all ints even if someone wants to serialize a
> pointer to an int somewhere in the program.
>
> However, if someone does serialize a pointer to an int, then it won't be
> tracked and that could create an error.

I recall that in earlier versions users simply *could not* serialize a pointer
to int. Did that change, or am I wrong?

> > I think that for a split 'save/load' you'll always have to use
> > operator<<
> > and operator>> of the archive.
>
> not true - inside save/load one can just as well use &

That would be a bit non-intuitive.

> What is odd about the above in my mind is the ptr dereferencing. If we're
> already inside the object, why not just serialize the members? Doesn't ar
> << *this just make a recursive call to itself?
>
> I'm curious if anyone else is following this thread. It's getting pretty
> deep in the details of the serialization library.

Yea, looks like nobody cares much.

- Volodya

