Boost logo

Boost :

Subject: Re: [boost] [Boost.Serialization] Help with PR fixing a memory leak in the current version
From: Robert (r.firl_at_[hidden])
Date: 2018-06-27 21:19:10


On 6/26/2018 3:39 AM, Alexander Grund via Boost wrote:
> Hi,
>> I much appreciate your willingness to invest effort in this issue. But
>> my analysis and conclusions regarding the issue are totally different.
>> Motivated your efforts I added new tests for the singleton and spent
>> time enhancing tests for dll.  One test was just to complicated for me
>> to make work with b2 so I suppress it. Only one test is failing on
>> some platforms - test_export.  This is related to DLLs.  I added
>> windows + mingw and msvc++ to my machine just to try to replicate the
>> failure in my own environment.  I have been unable to do so.
> I do  have (Makefile) test, which does show the behaviour described. I'm
> pretty sure, if executed on another machine, it will fail too. I tried
> to add it to the b2-Testsuite but failed due to having static boost in
> dynamic libraries (which is the way I found where it crashes for sure).
> And there is 1 testcase currently in the Boost.Serialization testsuite
> that DOES fail and I'm pretty sure the issue described is the reason but
> of course I cannot really confirm this as the root cause is undefined
> behaviour (or ambiquities with shared libs)
>> Late last year I tried to address this dll/export problem by merging
>> in a fix.  This fix entailed dynamically allocated a data structure in
>> the singleton. This fixed this failing test, but unbeknownst to me,
>> resulted in the memory leak.
> This is what I tried to describe too in the issue and MR: The leak is
> simply due to no one freeing the dynamically allocated memory. This is
> even deterministic and easily reproducible. I think that oversight might
> be caused by you being so used to the code, that you overlook things
> that might be visible to fresh eyes. This is why I would like to also
> have some one else comment on the PR too.
>> The way forward is to add some more tests which isolate the problem.
>> I'm aware that you've submitted one test which fails.  It seems like
>> its on the right track.  But (IIRC), it dynamically loads/unloads DLLS
>> containing user level serialization code in the same sequence.  I
>> think its reasonable to insist that if a user is going to do that that
>> such DLLS be unloaded in reverse order.  I admit that setting up these
>> sorts of tests is difficult and time consuming in any build
>> environment.  But that's where I am.
> I can provide a test for b2 (the same as I have the Makefile for) if the
> boost test suite is build with `-fPIC`. This does not involve unloading
> DLLs or so but simply links an executable against 2 shared libraries
> which are linked against static boost. This crashes reproducibly. If
> anyone can tell me how to do the fPIC stuff for Boost.Build I'm happy to
> add this test to show that it fails with a crash in current code and
> that my approach fixes this.
>
> In any case, I do have the feeling that the reason for the changes done
> in my PR (and the changes itself) are not fully understood. This is why
> I want to encourage question to the changes itself as I think those were
> not (only) meant to fix an issue in some circumstances but actually
> correct the code with test cases to prove this where possible (e.g. the
> is_destroyed flag). Other testcases just try to run into the problem I
> described.

I confirm that the 1.66 version never ever invokes the singleton
destructor declared and defined in boost/serialization/singleton.hpp,
approximately line 175. It does not matter whether it is a debug or
release build, 32-bit, or 64-bit Microsoft or Intel C++ on Windows. A
DLL wrapper around Boost.Serialization is used at run time. The actual
Boost.Serialization library is statically linked to the DLL. The
roughly 20 line main function, test executable then uses the DLL to
invoke the serialization.

I verify the behavior using an old school "printf" approach, with
std::cout messages and static ptrdiff_t counting variables inside the
get_instance() and ~singleton() methods. I observe eleven independent
instances of the singleton class, with ZERO calls to the singleton
destructor. The eleven singletons are created with ONE instance of the
serialization object, regardless of invoking the serialize method.

The main function:

int main()
{
        {
                HINSTANCE hGetProcIDDLL = LoadLibrary(L"BoostSerializeDLL.dll");
        }
        /**/
        std::cout << "Beginning unit test...";
        std::shared_ptr<thislibrary::myclass> myc = thislibrary::create_myclass();
        {
                std::shared_ptr<std::ofstream> ostream =
std::make_shared<std::ofstream>();
                ostream->open("C:\\temp\\leak_test.xml");
                {
                        boost::archive::xml_oarchive xml(*ostream);
                        myc->serialize(xml, 1); // comment this line out and the leak remains.
                }
                ostream->close();
        }
        /* */
        return 0;
}

Fundamentally, there is something between 300 to 400 bytes memory leak.
Yes, it is not much with today's world of GB RAM. However, when you
have production code that leaks, then ANY memory leak is unacceptable.
Essentially, the leak is due to the lack of the eleven singleton
instances NOT invoking the respective destructor. Plus, the group I am
part of spends tens of hours tracking down why the production product
has memory leaks. The final convincing point that must be clearly
understood is this: due to the leak, Boost.Serialization has been
replaced. If that does not convince a reader that there is a real
problem, then nothing will.

Sincerely,

Robert

>
> Best, Alex
>


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk