Subject: Re: [boost] Help needed for shared_ptr issue on iPad2 (dual core ARM)
From: Phil Endecott (spam_from_boost_dev_at_[hidden])
Date: 2011-03-28 12:23:10


Hi Peter,

Peter Dimov wrote:
> Ticket #5372:
>
> https://svn.boost.org/trac/boost/ticket/5372
>
> says that shared_ptr's ARM spinlock implementation (which uses the swp
> instruction) doesn't work properly on iPad2 (which has a dual core ARM
> processor). The sample program in the ticket compares it to a loop using
> __sync_fetch_and_add, which means that the __sync intrinsics are implemented
> by the compiler the submitter is using. These didn't work on gcc for ARM
> when we tested them, but may have been added meanwhile. (I can see some code
> samples that test for 4.4, but the official docs state that ARM intrinsics
> are only supported on Linux before 4.6, which was released yesterday.)
>
> So, we have two questions; first, why does the swp-based spinlock fail, and
> second, how can we detect support for __sync intrinsics and use them.
>
> Anybody with ARM knowledge and iPad2 development access?

First let me say that the "right way" to fix this is surely to get
Boost.Atomic finished and to use that as the basis of shared_ptr. I've
contributed ARM code for Boost.Atomic that knows about the different
architecture versions and will use ldrex/strex on ARMv7 (though it
needs some attention from someone who knows more than I do about memory
barriers, and it has had very little testing). I also have a trivial
sp_counted_base_atomic.hpp that uses it. These are in use in a number
of iPad apps and I've not yet had any reports of problems on the iPad 2
(fingers crossed). It seems that perhaps Helge doesn't have enough
free time to finish this off - in that case, I think it's a
sufficiently important library that we should perhaps consider how we
can help to progress it. I could certainly contribute a modest amount
of time and testing resource to it.
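
To illustrate the idea of basing the reference count on an atomics
library, here is a minimal sketch. It is not the actual
sp_counted_base_atomic.hpp; the class name counted_base_sketch is
hypothetical, and std::atomic is used as a stand-in for Boost.Atomic on
the assumption that a C++11 compiler is available.

```cpp
#include <atomic>

// Hypothetical, simplified counted base. std::atomic stands in for
// Boost.Atomic here (an assumption made for illustration only).
class counted_base_sketch {
    std::atomic<long> use_count_;

public:
    counted_base_sketch() : use_count_(1) {}

    // Taking another reference needs atomicity but no ordering.
    void add_ref() {
        use_count_.fetch_add(1, std::memory_order_relaxed);
    }

    // Returns true when the caller has dropped the last reference.
    // acq_rel ordering makes earlier writes to the object visible to
    // whichever thread ends up destroying it.
    bool release() {
        return use_count_.fetch_sub(1, std::memory_order_acq_rel) == 1;
    }

    long use_count() const {
        return use_count_.load(std::memory_order_acquire);
    }
};
```

The point of this design is that the choice of instruction (ldrex/strex,
a kernel helper, or something else) is left to the atomics library
instead of being hard-coded in shared_ptr.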

In the meantime, my understanding is that SWP is "deprecated" in ARMv7
- except that it is a peculiarly strong kind of deprecation where you
have to turn on a bit in a control register to enable it. I have asked
Apple what they do with this bit on the iPad 2 (where of course the
lockdown means individual apps cannot change it) and I await an answer.

One other issue is that even when enabled, SWP might not have the
required memory barrier semantics on multi-processor systems, i.e.
you might need to put explicit barrier instructions either side of it.
I'm uncertain about this; it doesn't help that the ARMv7 architecture
documents are still only available under NDA. Anyone here have copies?
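
As a rough sketch of what "barriers either side" could look like, the
following uses GCC's __sync builtins in place of SWP inline assembly.
It assumes a toolchain that implements these builtins for the target,
which is exactly what is in question for some ARM compilers. The class
name spinlock_sketch is hypothetical and this is not Boost's spinlock.

```cpp
// Hypothetical spinlock built on GCC __sync builtins instead of SWP.
class spinlock_sketch {
    volatile int v_;

public:
    spinlock_sketch() : v_(0) {}

    bool try_lock() {
        // Atomic exchange; GCC documents this as an acquire barrier only.
        int previous = __sync_lock_test_and_set(&v_, 1);
        // Explicit full barrier after taking the lock. Possibly
        // redundant, but it makes the ordering unambiguous.
        __sync_synchronize();
        return previous == 0;
    }

    void lock() {
        while (!try_lock()) {
            // Busy-wait; a real implementation would yield or back off.
        }
    }

    void unlock() {
        // Full barrier before the releasing store.
        __sync_synchronize();
        __sync_lock_release(&v_);
    }
};
```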

I don't yet have an iPad 2, but will eventually; I do have another
dual-core ARM box with an Nvidia Tegra 2 chip, but I'm not sure if
anything useful can be learnt from testing on it.

> how can we detect support for __sync intrinsics and use them.

I believe that there is a family of macros of the form
__GCC_HAVE_SYNC_COMPARE_AND_SWAP_N, where N is the operand size in
bytes (1, 2, 4, 8 or 16). The difficulty is that they were
introduced well after the actual intrinsics were added, so there are
gcc versions that do have the intrinsics but not the macros. Last time
I checked, this was too much of an issue to ignore. Maybe things have
moved on enough that these macros could now be used.
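
A minimal sketch of such a detection follows. GCC predefines
__GCC_HAVE_SYNC_COMPARE_AND_SWAP_4 when 4-byte operations are available
natively; the names SKETCH_HAS_SYNC and sketch_atomic_increment are
hypothetical, and the fallback branch is only a placeholder for
whatever existing implementation would otherwise be used.

```cpp
// Detect native 4-byte __sync support via the compiler's predefined macro.
#if defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_4)
#  define SKETCH_HAS_SYNC 1
#else
#  define SKETCH_HAS_SYNC 0
#endif

// Increments *p and returns the previous value.
inline int sketch_atomic_increment(int* p) {
#if SKETCH_HAS_SYNC
    return __sync_fetch_and_add(p, 1);
#else
    // Placeholder only: this branch is NOT atomic. Real code would
    // select an existing spinlock- or assembly-based implementation.
    int old = *p;
    *p = old + 1;
    return old;
#endif
}
```

Compilers that have the intrinsics but predate the macro would take the
fallback branch here, so an additional version check might still be
needed for them.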

Regards, Phil.

