From: Robert Ramey (ramey_at_[hidden])
Date: 2005-03-06 12:17:40
troy d. straszheim wrote:
>> Here's a use case that has been discussed before, but for which I
>> couldn't seem to find any solid resolution in the list archives:
>>
>> class A;
>> some_oarchive oa;
>>
>> void mutate(shared_ptr<A>& aptr_)
>> {
>>     aptr_ = shared_ptr<A>(new A);
>> }
>>
>> shared_ptr<A> aptr;
>>
>> for (int i=0; i<LARGE_NUMBER; i++)
>> {
>>     mutate(aptr);
>>     oa << aptr;
>> }
>>
>> The problem is that your oarchive ends up with somewhere between two
>> and LARGE_NUMBER distinct A's in it. Usually two. Presumably the
>> "two" is because in this simple example only A's are getting
>> allocated and deallocated, and therefore there are often free A-sized
>> blocks conveniently lying around for reuse. aptr gets assigned A's at
>> two different alternating addresses.
>>
For what it's worth, I believe there is a warning in the documentation against
this kind of thing. In fact, the documentation says that this will invoke a
compile time error. I just checked - it doesn't produce a compile time error
as I expected. I found the code that does this commented out - now I don't
remember why I commented it out.
That is, the following are not recommended:
a) changing the state of data while it's in the process of being saved.
b) serializing data off the stack. This will break the tracking, as
different objects will have the same address and be mis-identified as being
the same object.
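
To make the address-reuse problem concrete, here is a minimal sketch. It does
not use the serialization library at all: toy_oarchive is a hypothetical
stand-in that tracks saved objects purely by address, and placement new into a
fixed buffer is used only to force two different objects onto the same address
deterministically.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <new>
#include <sstream>
#include <string>

struct A { int value; };

// Hypothetical toy archive (not the library's implementation): an object
// is written in full the first time its address is seen; after that only
// a back-reference to the earlier id is written.
class toy_oarchive {
public:
    void save(const A* p) {
        auto it = ids_.find(p);
        if (it != ids_.end()) {
            // Address already seen: assume it is the same object.
            out_ << "ref " << it->second << '\n';
            return;
        }
        std::size_t id = ids_.size();
        ids_.emplace(p, id);
        out_ << "obj " << id << " value " << p->value << '\n';
    }
    std::string str() const { return out_.str(); }
private:
    std::map<const void*, std::size_t> ids_;
    std::ostringstream out_;
};

// Two distinct objects are created one after the other at the same address.
std::string demo_address_reuse() {
    alignas(A) unsigned char storage[sizeof(A)];
    toy_oarchive oa;

    A* first = new (storage) A{1};
    oa.save(first);
    first->~A();

    A* second = new (storage) A{2};  // same address, different object
    oa.save(second);
    second->~A();

    return oa.str();
}
```

In this sketch demo_address_reuse() returns "obj 0 value 1\nref 0\n": the
second object's value of 2 never reaches the output, because its address
matches one that has already been tracked.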
If I had nothing else to do and could figure out how to do it, I would like
to implement a warning so that if one used a << operator with a non-const
argument one would get a warning or maybe
I am in the process of adding two more flags to archives:
a) no_tracking - which will suppress tracking for all objects regardless of
the setting of their serialization traits. My motivation for doing this was
to permit the usage of serialization for things such as debug and
transaction logs, which would generate cases such as yours above.
b) no_object_creation - which will simply reload pointers rather than
re-create them. My motivation is to permit serialization to be used to
implement the Memento pattern as described in the GoF Design Patterns book.
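
As a rough illustration of what "reload rather than re-create" means for a
memento, here is a small sketch. It is not written against the library and
does not show how no_object_creation is implemented; save_state and
reload_in_place are hypothetical helpers, and a plain stringstream stands in
for the archive.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>

struct A { int x; int y; };

// Hypothetical helper: write the pointee's state to the stream.
void save_state(std::ostream& os, const A* p) {
    os << p->x << ' ' << p->y << '\n';
}

// Hypothetical helper: overwrite the existing pointee with the saved
// state. No new object is allocated, so the pointer stays valid.
void reload_in_place(std::istream& is, A* p) {
    is >> p->x >> p->y;
}

bool demo_memento() {
    A a{1, 2};
    A* const p = &a;

    std::stringstream memento;
    save_state(memento, p);       // take the snapshot

    a.x = 10;                     // mutate the object afterwards
    a.y = 20;

    reload_in_place(memento, p);  // restore into the same object

    // Same address as before, original state restored.
    return p == &a && a.x == 1 && a.y == 2;
}
```

The point of the sketch is only that the object's identity (its address) is
preserved across the restore, which is what a memento needs.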
I currently have these changes in my local code base, and I've run all my
old tests and they still work. I'm still struggling with some small issues
regarding loading into STL collections with no_object_creation. I'm also
struggling with some issues related to these flags being runtime rather than
compile time (i.e. template instantiation) options.
I still have to write tests, demos, a tutorial, and documentation. I'm not
sure, but I think these new facilities may address the use cases raised
here.
>> I looked through the archives a bunch and didn't come across anything
>> conclusive. It seemed that some thought this kind of use case was
>> pathological, but I'm not sure why.
My view has been that changing the state of an object while it is in the
process of being serialized will inevitably lead to programs that are not
provably/demonstrably correct. The same goes for archive classes whose
behavior can be changed during the course of serialization.
Now, by supporting the usage of serialization for logging, this concept will
be broken. I'm still struggling with this.
a) The idea of serialization of mutable objects does have application in
logging-type applications. It's appealing to use it for this purpose.
b) It will break the original concept and lead to cases where errors are
introduced which are almost impossible to track down without tracing into
the implementation of the serialization library itself. This defeats the
whole purpose of having a library in the first place.
>>
>> What I'd like to be able to do is to tell the archive, "The previous
>> calls to operator<<() represent a 'snapshot' of the state of some
>> group of objects, and now I want you to forget about existing objects
>> because I am going to rearrange them all. Continue to track object
>> types, but forget about the addresses." I realize that this creates
>> the possibility for memory leaks, but if the serialization is done
>> through one toplevel call to operator<< on a shared_ptr whose pointee
>> contains pointers to a whole universe of home-cooked pointer
>> spaghetti, I don't see a better way to do this, and I don't see how
>> to clearly express what I intend via the export and tracking macro
>> mechanisms.
>> You can't close and reopen the archive in the top loop, you
>> get duplicate headers.
What about "no_header"? And what would be wrong with duplicate headers
anyway? The stream is still open and could just as well contain multiple
archives.
>> The list archives mention the use case of
>> serializing the state of some memory pool that is very likely to get
>> reused: I think the little example above is probably the simplest
>> case of this.
I'm not sure what this means.
>> So without asking for a sanity-check, I implemented
>> basic_oarchive::flush(), and some tests. The changes to
>> basic_oarchive
Sanity is sometimes overrated.
>> and basic_oarchive_impl are very small. basic_oarchive has an
>> internal object_set, which tracks object_ids and addresses. I add a
>> num_flushed_objects member, and flush() clears out the set and adds
>> the number of objects flushed to this counter. New tracked objects
>> are assigned object_ids starting at num_flushed_objects +
>> object_set.size(). In this way the class_ids are reused
>> post-flush, but object_ids are not. The interface is simply this:
My comments above should make it clear I wouldn't be enthusiastic about this
approach.
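
For readers following along, the id bookkeeping described in the quoted
proposal can be sketched roughly as below. This is a toy model written from
the description only, not the actual patch or the real basic_oarchive code;
toy_object_tracker and its members are hypothetical names that merely echo
the ones mentioned in the quoted text.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Toy model of the proposed scheme: addresses are forgotten on flush(),
// but object ids keep counting upward so they are never reused.
class toy_object_tracker {
public:
    // Return the id for this address, assigning a fresh one if unseen.
    std::size_t track(const void* address) {
        auto it = object_set_.find(address);
        if (it != object_set_.end()) {
            return it->second;
        }
        std::size_t id = num_flushed_objects_ + object_set_.size();
        object_set_.emplace(address, id);
        return id;
    }

    // Forget all tracked addresses, remembering how many ids were used.
    void flush() {
        num_flushed_objects_ += object_set_.size();
        object_set_.clear();
    }

private:
    std::map<const void*, std::size_t> object_set_;
    std::size_t num_flushed_objects_ = 0;
};

bool demo_flush() {
    int a = 0;
    int b = 0;
    toy_object_tracker tracker;

    std::size_t id_a = tracker.track(&a);           // first object: id 0
    std::size_t id_b = tracker.track(&b);           // second object: id 1
    bool same_before = tracker.track(&a) == id_a;   // still tracked

    tracker.flush();

    // Same address after the flush is treated as a new object.
    std::size_t id_a_again = tracker.track(&a);

    return id_a == 0 && id_b == 1 && same_before && id_a_again == 2;
}
```

After the flush, the address of a is treated as unseen and receives id 2
rather than its old id 0.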
Having said that, I find it personally gratifying to find that some people
are enamored enough with this library to spend this kind of effort. Certain
people have taken the library in "experimental" directions and I have worked
their results into the library. Persons who have made significant
contributions are:
Pavel Vozenilek - Borland compilers and documentation
Martin Ecker - DLL versions of Serialization and serialization of classes
implemented in DLLs (plug-ins)
troy d. straszheim (you) - serialization of variant.
At the same time I endeavor to keep it from breaking under the weight of its
own success. It's a fine line. Key "fixed points" in my requirements were
and still are:
a) Boost acceptance - I need this for my resume as I'm currently looking for
work.
b) support of all compilers on which tests are run.
c) idiot proof user interface and documentation. Ah - maybe I'd better say,
user interface and documentation such that one can use the library without
having to delve into its implementation. Also, it's important to me that one
be able to use the library with a very short learning curve - say 1 hour to
get started. Personally, I don't have much more patience than that.
Robert Ramey