Boost Users :

Subject: Re: [Boost-users] in IOS, thread_info object is being destructed before the thread finishes executing (Andy Weinstein)
From: Andy Weinstein (andyw_at_[hidden])
Date: 2013-02-04 13:00:14


I should note that the boost.sh script from Pete Goodliffe (since modified by others), which is commonly used to compile Boost for iOS,
has the following note in its header:
: ${EXTRA_CPPFLAGS:="-DBOOST_AC_USE_PTHREADS -DBOOST_SP_USE_PTHREADS"}
# The EXTRA_CPPFLAGS definition works around a thread race issue in
# shared_ptr. I encountered this historically and have not verified that
# the fix is no longer required. Without using the posix thread primitives
# an invalid compare-and-swap ARM instruction (non-thread-safe) was used for the
# shared_ptr use count causing nasty and subtle bugs.
#
# Should perhaps also consider/use instead: -BOOST_SP_USE_PTHREADS

I use those flags, but to no avail.

I also found that Boost contains an alternate spinlock implementation for ARM processors, which also seems to
address this issue directly:
spinlock_gcc_arm.hpp

The version included with boost 1.48 uses outdated arm assembly.
I took the updated version from boost 1.52, but I'm having trouble compiling it.
I get the following error:

predicated instructions must be in IT block

I found a reference to what looks to be a similar use of this instruction here:
https://zeromq.jira.com/browse/LIBZMQ-414

I was able to use the same idea to get the 1.52 code to compile by modifying
the code as follows (I inserted an appropriate IT instruction):
        __asm__ __volatile__(
            "ldrex %0, [%2]; \n"
            "cmp %0, %1; \n"
            "it ne; \n"
            "strexne %0, %1, [%2]; \n"
            BOOST_SP_ARM_BARRIER :
            "=&r"( r ): // outputs
            "r"( 1 ), "r"( &v_ ): // inputs
            "memory", "cc" );

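For comparison, a spinlock of this kind can also be written without hand-written ldrex/strex assembly, by using the GCC/Clang __sync builtins and letting the compiler choose the instructions for the target. The following is only a minimal sketch of that idea under stated assumptions, not the Boost implementation: the class name sync_spinlock_sketch is made up for illustration, and nothing here has been verified to avoid the crash discussed in this thread.

```cpp
// Minimal spinlock sketch built on the GCC/Clang __sync builtins.
// sync_spinlock_sketch is a hypothetical name, not a Boost class.
class sync_spinlock_sketch {
public:
    sync_spinlock_sketch() : v_(0) {}

    // Atomically writes 1 and returns the previous value.
    // The lock was free (and is now ours) only if the previous value was 0.
    bool try_lock() {
        return __sync_lock_test_and_set(&v_, 1) == 0;
    }

    // Busy-wait until the lock is acquired.
    void lock() {
        while (!try_lock()) {
            // spin
        }
    }

    // Writes 0 with release semantics.
    void unlock() {
        __sync_lock_release(&v_);
    }

private:
    volatile int v_;

    // Not copyable: declared but intentionally left undefined.
    sync_spinlock_sketch(const sync_spinlock_sketch&);
    sync_spinlock_sketch& operator=(const sync_spinlock_sketch&);
};
```

A second try_lock() on an already held lock returns false, and it succeeds again after unlock().
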
In any case, the #ifdefs in this file test for ARM architecture macros that are not
defined in my environment. After I simply edited the file so that only the ARMv7 code
was left, the compiler complained about the definition of BOOST_SP_ARM_BARRIER:

In file included from ./boost/smart_ptr/detail/spinlock.hpp:35:
./boost/smart_ptr/detail/spinlock_gcc_arm.hpp:39:13: error: instruction requires a CPU feature not currently enabled
            BOOST_SP_ARM_BARRIER :
            ^
./boost/smart_ptr/detail/spinlock_gcc_arm.hpp:13:32: note: expanded from macro 'BOOST_SP_ARM_BARRIER'
# define BOOST_SP_ARM_BARRIER "dmb"
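
One way to see which architecture macros the compiler actually defines for a given build is to compile a small probe with the same flags. The sketch below is an illustration only: the function name arm_arch_macros is hypothetical, the list of macros is not exhaustive, and whether a given toolchain defines __ARM_ARCH_7S__ is an assumption. A possible (unverified) explanation for the "dmb" error is that the file is also being built for a target, such as an armv6 slice, where that instruction is not available.

```cpp
#include <string>

// Returns a space-separated list of the ARM-related predefined macros
// that are visible in this translation unit. Hypothetical helper name.
std::string arm_arch_macros() {
    std::string found;
#if defined(__arm__)
    found += "__arm__ ";
#endif
#if defined(__thumb__)
    found += "__thumb__ ";
#endif
#if defined(__ARM_ARCH_6__)
    found += "__ARM_ARCH_6__ ";
#endif
#if defined(__ARM_ARCH_7__)
    found += "__ARM_ARCH_7__ ";
#endif
#if defined(__ARM_ARCH_7A__)
    found += "__ARM_ARCH_7A__ ";
#endif
#if defined(__ARM_ARCH_7S__)  // assumption: some toolchains define this for armv7s
    found += "__ARM_ARCH_7S__ ";
#endif
    if (found.empty()) {
        found = "(no ARM macros defined)";
    }
    return found;
}
```

Printing the result once per -arch value used by the build script would show whether the checks in spinlock_gcc_arm.hpp can match at all.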

So that's where things stand. I am hopefully awaiting your help.

Thanks!

On Feb 4, 2013, at 3:38 PM, boost-users-request_at_[hidden] wrote:

Date: Mon, 4 Feb 2013 12:38:29 +0000
From: Andy Weinstein <andyw_at_[hidden]>
To: "Boost-users_at_[hidden]" <Boost-users_at_[hidden]>
Cc: Moshe Rubin <moshe_at_[hidden]>, Ronnie Wulfsohn
<roni_at_[hidden]>
Subject: [Boost-users] in IOS, thread_info object is being destructed
before the thread finishes executing
Message-ID: <5D324B34-BE82-4B4C-89AD-ED8A0095129B_at_[hidden]>
Content-Type: text/plain; charset="us-ascii"

Our project uses a few Boost 1.48 libraries on several platforms, including Windows, Mac, Android, and iOS.
We are able to get the iOS version of the project to crash reliably (though not trivially), and
from our investigation we see that ~thread_data_base is being called on the thread's thread_info while its thread is still running.

This seems to happen as a result of the smart pointer reaching a zero count, even though it is obviously still
in scope in the thread_proxy function which creates it and runs the requested function in the thread.
This seems to happen in various cases - the call stack is not identical between crashes, though there are a few
variations which are common.
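
To make the expected behavior concrete, here is a single-threaded sketch of the ownership pattern described above. It uses std::shared_ptr as a stand-in for boost::shared_ptr, and the names thread_data_sketch and proxy_sketch are hypothetical, not Boost code. It only shows what should happen when the reference count is maintained correctly: the object stays alive as long as the local copy inside the proxy function exists, even after the external handle is dropped.

```cpp
#include <memory>

// Hypothetical stand-in for boost::detail::thread_data_base.
// Records its own destruction through a flag owned by the caller.
struct thread_data_sketch {
    explicit thread_data_sketch(bool* destroyed) : destroyed_(destroyed) {}
    ~thread_data_sketch() { *destroyed_ = true; }
    bool* destroyed_;
};

// Mimics the shape of thread_proxy: it holds its own copy of the pointer
// for the duration of the call. Returns true if the object was still alive
// after the external handle had been released.
bool proxy_sketch(std::shared_ptr<thread_data_sketch>& external_handle,
                  const bool& destroyed) {
    std::shared_ptr<thread_data_sketch> local = external_handle;  // two owners
    external_handle.reset();                                      // one owner left
    return !destroyed && local.use_count() == 1;
}   // local goes out of scope here, and only now is the object destroyed

bool object_outlives_external_handle() {
    bool destroyed = false;
    std::shared_ptr<thread_data_sketch> handle =
        std::make_shared<thread_data_sketch>(&destroyed);
    bool alive_inside = proxy_sketch(handle, destroyed);
    // Alive while the proxy held it, destroyed once the proxy returned.
    return alive_inside && destroyed;
}
```

The crash reported here corresponds to the destructor running while the equivalent of local is still in scope, which should be impossible unless the count itself is being corrupted.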

Just to be clear - this often requires running code which is creating hundreds of threads, though there are
never more than about 30 running simultaneously. I have "been lucky" and got it very very early in the
run also, but that's rare.
I created a version of the destructor which actually catches the code red-handed:

in libs/thread/src/pthread/thread.cpp:

thread_data_base::~thread_data_base()
{
  boost::detail::thread_data_base* const thread_info=detail::get_current_thread_data();
  void *void_thread_info = (void *) thread_info;
  void *void_this = (void *) this;
  // is somebody destructing the thread_data other than its own thread?
  // (remember that its own thread should no longer point to it anyway,
  // because of the call to detail::set_current_thread_data(0) in thread_proxy)
  if (void_thread_info) { // == void_this) {
    __builtin_trap();
  }
}

I should note that (as seen from the commented-out code) I had previously checked to see that void_thread_info == void_this because I
was only checking for the case where the thread's current thread_info was killing itself.
I have also seen cases where the value returned by get_current_thread_data is non-zero and
different from "this", which is really weird.

Also when I first wrote that version of the code, I wrote:

if (((void*)thread_info) == ((void*)this))

and at run-time I got some very weird exception that said something about a virtual function table
or something like that - I don't remember. I decided that it was trying to call "==" for this object type
and was unhappy with that, so I rewrote it as above, putting the conversions to void * on separate
lines of code. That in itself is quite suspicious to me. I am not one to rush to blame compilers, but...

I should also note that when we did catch this happening in the trap, we saw the destructor
~shared_count appear twice consecutively on the stack in Xcode. Very doubleweird.
We tried to look at the disassembly, but couldn't make much out of it.

Again - it looks like this is always a result of the shared_count which seems to be owned by
the shared_ptr which owns the thread_info reaching zero too early.

Help!

Thanks,

Andy Weinstein

PS I cannot say for sure yet whether this problem occurs on the other platforms, though it looks like
Windows is more solid than Mac. Android is pretty solid also, though maybe in between Windows
and Mac.


