From: Michael van der Westhuizen (r1mikey_at_[hidden])
Date: 2006-07-04 15:21:21


Hi Tom,

On 7/4/06, Tomas Puverle <Tomas.Puverle_at_[hidden]> wrote:
> > Could you send the sunpro versions to me (or the list) please? I'll
> > see what I can do about getting them into asm() blocks in the morning.
>
> I have tried and tried a while ago. :) My conclusion was that unfortunately
> register allocation doesn't play nicely with inlining. Also, returning a
> parameter is REALLY hard to do and I wasn't able to get it to work in the
> general case.

Yes, that's the nightmare part I was talking about :-)

> The atomics for sunpro are pretty much identical to the gcc
> version, however, because the asm() block is inserted at a different stage of
> compilation than the .il file, the block doesn't recognise the synthetic
> instructions such as cas or casx. You will need to use the real version of
> the instruction, which is CASA and CASXA, so your asm block will look like
> this (if my memory serves me well):

[snip]

> If you can get this to work, I'd love to hear back from you because I
> absolutely loathe the .il model. It really makes it difficult to write libs,
> because you have to do special magic to add the .il to the compile line when
> your library gets included.

We are starting to drift a little off topic now, but if the list
doesn't mind indulging us :-)

As Tom correctly states above, inlining a function containing an asm()
call makes it difficult, if not impossible, to refer to registers when
compiling with Sun Studio.

The following technique is not suitable for applications which cannot
accept the overhead of a function call, but it is suitable for libraries
wanting to remain header-only while using inline assembly.

All of that aside, this technique also smells like a hack, but it does work.

What you do is write each function that contains inline assembly as a
function template in an anonymous namespace in your header, like so:

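namespace
{
// effects: (*target)++;
// T is unused; it exists only to make this function a template.
template <typename T>
void templated_atomic_inc_32(volatile uint32_t *target)
{
#if defined(__i386)
    // 8(%ebp) is the first stack argument (target) under the 32-bit x86
    // calling convention, so this relies on the function having its own
    // stack frame.
    asm(".volatile \n\
        movl 8(%ebp), %eax \n\
        lock \n\
        incl (%eax) \n\
        .nonvolatile \n\
    ");
#else
# error Port me
#endif
}
}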

Then give it a "normal" name, using an inline function, like this:

inline void my_atomic_inc_32(volatile uint32_t *target)
{
    templated_atomic_inc_32<bool>(target);
}

Then just use "my_atomic_inc_32" wherever you need it, and you won't
have to drag .il files around with you!
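
For anyone who wants to try the shape of this without Sun Studio to hand,
here is a rough sketch of the same template-plus-inline-wrapper layout seen
from the calling side. Note the assumptions: the asm() body is swapped for
the GCC/Clang __atomic_fetch_add builtin purely as a stand-in so that it
builds elsewhere, and count_events is a made-up caller for illustration. It
is not the Sun Studio code above and says nothing about how Sun Studio
treats inlining.

```cpp
#include <stdint.h>
#include <cassert>

namespace
{
// Stand-in for the asm()-based template. T is unused, as in the original.
// ASSUMPTION: __atomic_fetch_add is a GCC/Clang builtin, used here only so
// the sketch compiles without Sun Studio.
template <typename T>
void templated_atomic_inc_32(volatile uint32_t *target)
{
    __atomic_fetch_add(target, 1u, __ATOMIC_SEQ_CST);
}
}

// The "normal" name that callers see.
inline void my_atomic_inc_32(volatile uint32_t *target)
{
    templated_atomic_inc_32<bool>(target);
}

// Hypothetical caller: increments a counter n times through the wrapper.
inline uint32_t count_events(uint32_t n)
{
    volatile uint32_t counter = 0;
    for (uint32_t i = 0; i < n; ++i)
        my_atomic_inc_32(&counter);
    return counter;
}
```

Callers only ever see my_atomic_inc_32; the template stays an
implementation detail of the header.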

I've attached my proof-of-concept source. This compiles and works
predictably with or without inlining, and in debug or at the highest
optimisation levels.

Tom, I hope this is the solution you were looking for - if your
performance requirements can accept the extra function call, then
this should work for you.

Michael

