Subject: Re: [Boost-bugs] [Boost C++ Libraries] #10740: Multi-level containers do not cooperate with address tracking
From: Boost C++ Libraries (noreply_at_[hidden])
Date: 2014-11-27 21:28:08
#10740: Multi-level containers do not cooperate with address tracking
-------------------------------------+-------------------------------------
Reporter: Simon Etter | Owner: ramey
<ettersi@…> | Status: closed
Type: Bugs | Component: serialization
Milestone: To Be Determined | Severity: Problem
Version: Boost 1.56.0 | Keywords: Address tracking, STL
Resolution: invalid | containers
-------------------------------------+-------------------------------------
Comment (by Simon Etter <ettersi@…>):
I don't understand your posted code snippet. The second line lets `pd`
point to some element in `l`. On the third line from below, you read a new
vector into `l`, which calls `clear()` on `l` (see
collections_load_imp.hpp : 140) and therefore invalidates `pd`. On the
last line, you nevertheless dereference `pd`. This is undefined behaviour.
I think we are still talking past each other. Let me describe the
situation once more for the one-level case. Assume `oa` is any output
archive, and `l` and `pd` are defined and initialized as follows:
{{{
std::vector<dummy> l(1);
dummy* pd = &l.back();
}}}
I call `d` the object of type `dummy` which is located at the address
`&l.back()`. I emphasize that the type and address of `d` are the only
relevant properties here. We first serialize `l`:
{{{
oa << l;
}}}
The implementation of serialization for `std::vector<>` calls serialize on
every element of `l`, thus also on `d`. Next, we serialize `pd`.
{{{
oa << pd;
}}}
According to
[[http://www.boost.org/doc/libs/1_57_0/libs/serialization/doc/serialization.html#pointeroperators]],
the serialization code checks whether an object of type `dummy` at address
`&d` has already been serialized. Since `d` was already serialized as an
element of `l`, this is indeed the case. Thus, we only store some special
tag for `pd`, no actual object information.
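The save-side behaviour described above can be sketched with a toy archive. This is only a model under stated assumptions, not Boost's implementation: `toy_oarchive`, its member functions, and the record strings are hypothetical names invented for illustration, and since only one type (`dummy`) is involved, the sketch tracks objects by address alone rather than by type and address.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

struct dummy { int value = 0; };

// Toy stand-in for an output archive (hypothetical, NOT Boost's code).
// It appends one text record per saved item and remembers the address
// of every dummy it has already written.
class toy_oarchive {
public:
    // Save an object held by value, e.g. as a container element.
    void save_object(const dummy& d) {
        std::size_t id = next_id_++;
        seen_[&d] = id;  // track: address -> object id
        records_.push_back("object " + std::to_string(id) +
                           " value " + std::to_string(d.value));
    }

    // Save a vector: the element count, then every element.
    void save_vector(const std::vector<dummy>& v) {
        records_.push_back("count " + std::to_string(v.size()));
        for (const dummy& d : v) save_object(d);
    }

    // Save through a pointer: if the pointee's address is already known,
    // write only a back-reference tag instead of the object itself.
    void save_pointer(const dummy* p) {
        auto it = seen_.find(p);
        if (it != seen_.end()) {
            records_.push_back("ref " + std::to_string(it->second));
        } else {
            save_object(*p);
        }
    }

    const std::vector<std::string>& records() const { return records_; }

private:
    std::map<const dummy*, std::size_t> seen_;
    std::vector<std::string> records_;
    std::size_t next_id_ = 0;
};

// Mirrors `oa << l; oa << pd;` from the text.
std::vector<std::string> save_vector_then_pointer() {
    std::vector<dummy> l(1);
    dummy* pd = &l.back();
    toy_oarchive oa;
    oa.save_vector(l);
    oa.save_pointer(pd);
    return oa.records();
}
```

In this model, `save_vector_then_pointer()` yields three records: the element count, the single object, and a back-reference tag for `pd`.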
Next, we create some input archive `ia` which reads the same file as `oa`
wrote to. We first deserialize the `l` from above, which we now call `l_`:
{{{
std::vector<dummy> l_;
ia >> l_;
}}}
This creates a new object of type `dummy` at some arbitrary address which
we can get through `&l_.back()`. We define `d_` to be this object
identified by the combination of type and address. When deserializing the
`pd` from above into `pd_`,
{{{
dummy* pd_;
ia >> pd_;
}}}
we encounter the special tag that we wrote to the archive instead of the
proper object. By the same section of the documentation as mentioned
above, this should allow the serialization library to recognize that it
does not need to create a new object but rather let `pd_` refer to `d_`.
We check this through the following assert:
{{{
assert(pd_ == &l_.back());
}}}
As mentioned, for a single level of `std::vector`, this indeed works.
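The load-side counterpart can be modelled in the same spirit. Again this is a sketch under assumptions, not the library's code: `toy_iarchive`, `record`, and the function names are hypothetical, and the toy loader writes elements directly into their final storage, which is what makes the one-level case unproblematic in this model.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

struct dummy { int value = 0; };

// One archive record: a full object or a back-reference tag (hypothetical).
struct record {
    bool is_reference;
    std::size_t id;
    int value;  // only meaningful when is_reference is false
};

// Toy stand-in for an input archive (NOT Boost's implementation).
class toy_iarchive {
public:
    explicit toy_iarchive(std::vector<record> records)
        : records_(std::move(records)) {}

    // Load `count` elements in place and remember where each one lives.
    void load_vector(std::vector<dummy>& v, std::size_t count) {
        v.clear();
        v.resize(count);  // final storage first, so addresses stay valid
        for (dummy& d : v) {
            const record& r = records_[pos_++];
            d.value = r.value;
            loaded_[r.id] = &d;  // track: object id -> final address
        }
    }

    // Load a pointer: a back-reference resolves to the tracked address.
    dummy* load_pointer() {
        const record& r = records_[pos_++];
        if (r.is_reference) return loaded_.at(r.id);
        dummy* p = new dummy;  // toy model: the caller would own this
        p->value = r.value;
        loaded_[r.id] = p;
        return p;
    }

private:
    std::vector<record> records_;
    std::map<std::size_t, dummy*> loaded_;
    std::size_t pos_ = 0;
};

// Mirrors `ia >> l_; ia >> pd_;` followed by the assert from the text.
bool pointer_refers_to_element() {
    std::vector<record> records = {{false, 0, 7}, {true, 0, 0}};
    toy_iarchive ia(records);
    std::vector<dummy> l_;
    ia.load_vector(l_, 1);
    dummy* pd_ = ia.load_pointer();
    return pd_ == &l_.back() && pd_->value == 7;
}
```

Here the back-reference resolves to `&l_.back()` because the tracked address is the element's final address.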
The key point is that `oa << l;` and `oa << pd;` try to serialize an object
of the same type at the same address. According to the documentation on
serializing pointers, the library detects this situation and stores only
one object. But exactly the same situation occurs for multilevel
containers! From the user's perspective, there is thus no reason to expect
the multilevel case to fail.
----
In the remainder, I'll try to explain why in fact it does not work with
the current implementation. The problem is on lines 61 to 67 in
collections_load_imp.hpp, and the line numbers below refer to this
section. Assume we serialized by running the following code
{{{
std::vector<std::vector<dummy>> l(1);
oa << l;
}}}
and are now about to deserialize this object. The lines
{{{
std::vector<std::vector<dummy>> l_;
ia >> l_;
}}}
cause the following to happen. We read that `l` contained a single
element. We therefore create a temporary (line 62), which we call `tll_`,
and deserialize this single element into it (line 64). Now `tll_` is a
`std::vector<dummy>` of length 1. Next, we call `l_.push_back(tll_)` (line
65). Since we would like future pointers to this "logical" object to point
to `&l_.back()` and not `&tll_`, we report the new address to the archive by
calling `reset_object_address()`. At this point, the error happens: We
would also have to tell the archive that we want future pointers to
reference `&l_.back().back()` and not `&tll_.back()`. We cannot do this,
however, because at this point in the code we don't know that `tll_`
(which corresponds to the variable `s` in the actual code) is in fact a
vector. In conclusion, I thus know where the bug is (and I am pretty sure
it is in fact a bug), but I don't know how to solve it.
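The failure mode can be reproduced without the library by modelling the tracking table by hand. This is a minimal sketch under assumptions: `address_table`, `load_result`, and `load_nested_via_temporary` are hypothetical names, the object ids 0 and 1 are arbitrary, and the table update stands in for the address update that the loader performs for the outer element only.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

struct dummy { int value = 0; };

// Toy tracking table: object id -> address where the loader last saw it.
// A stand-in for the archive's bookkeeping, NOT Boost's data structure.
using address_table = std::map<std::size_t, const void*>;

struct load_result {
    bool outer_tracked_correctly;
    bool inner_tracked_correctly;
};

load_result load_nested_via_temporary() {
    address_table table;
    std::vector<std::vector<dummy>> l_;

    // Deserialize the single outer element into a temporary. Both the
    // temporary and the dummy inside it get tracked at their current
    // addresses.
    std::vector<dummy> tll_(1);
    table[0] = &tll_;         // id 0: the inner vector
    table[1] = &tll_.back();  // id 1: the dummy inside it

    // Copy the temporary into its final place. The copy owns a new
    // buffer, so the dummy now lives at a different address.
    l_.push_back(tll_);

    // The loader only knows about the outer element, so only id 0 is
    // updated. Nothing updates id 1.
    table[0] = &l_.back();

    return { table[0] == &l_.back(),
             table[1] == &l_.back().back() };
}
```

In this model the outer entry ends up correct while the inner entry still holds `&tll_.back()`, so a later pointer to the nested `dummy` would resolve to an address inside the temporary.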
-- Ticket URL: <https://svn.boost.org/trac/boost/ticket/10740#comment:5>