
Subject: Re: [boost] Algebraic abstractions related to (big) integers in C++
From: Simonson, Lucanus J (lucanus.j.simonson_at_[hidden])
Date: 2011-03-31 02:48:20


Phil Endecott wrote:
> pavel wrote:
>> i think that ideally a feature like 'add_with_carry()' should be
>> available as a compiler intrinsic
>
> I had a look at this for the XInt review, when I wrote some ARM
> assembler here:
>
> http://article.gmane.org/gmane.comp.lib.boost.devel/215565
>
> The relevant code is this snippet to do a 96-bit add:
>
> "adds %0, %3, %6\n" // Add and set carry flag
> "adcs %1, %4, %7\n" // Add with carry-in and set carry flag
> "adc %2, %5, %8\n" // Add with carry-in.
>
> The difficulty is that there is only one carry flag and that it is
> implicit. So in a compiler intrinsic for add-with-carry you need to
> either:
> - Keep the carry implicit, and somehow rely on the compiler not
> inserting any instructions that change it in the generated
> instruction stream (which seems impractical), or
> - Make it explicit, but since there is only one place that it can be
> stored, the compiler will have a challenge to generate code, or
> - Transfer it to and from a general-purpose register, which will
> greatly reduce the speed (back to what you can get without assembler
> or intrinsics).
>
> I came to the conclusion that it is better to write multi-word
> addition code (like the above) in assembler for each platform.
>
> I believe that the issues are similar on other architectures that have
> a flag register, but maybe others can confirm or deny that.
>
> Any thoughts anyone?
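
For comparison, here is a minimal portable C++ sketch of the third option in the quoted list: the carry is kept in an ordinary variable instead of the flag register. The names `U96` and `add96` and the least-significant-limb-first layout are assumptions made up for this illustration; they do not come from XInt or from the linked article.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Three 32-bit limbs, least significant first (hypothetical layout).
using U96 = std::array<std::uint32_t, 3>;

// Adds a and b modulo 2^96. The carry travels between limbs in a
// general-purpose variable rather than in the CPU carry flag.
inline U96 add96(const U96& a, const U96& b)
{
    U96 r{};
    std::uint32_t carry = 0;
    for (std::size_t i = 0; i < r.size(); ++i) {
        // Widen to 64 bits so the carry out of bit 31 is not lost.
        std::uint64_t sum = std::uint64_t(a[i]) + b[i] + carry;
        r[i] = static_cast<std::uint32_t>(sum);         // low 32 bits
        carry = static_cast<std::uint32_t>(sum >> 32);  // 0 or 1
    }
    return r;  // the carry out of the top limb is discarded
}
```

Whether a given compiler turns this loop into an adds/adcs/adc chain like the one quoted above is not something the sketch guarantees.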

This is the add-with-carry instruction I'm most familiar with:
http://software.intel.com/en-us/articles/prototype-primitives-guide/
ADC_PI - Add Int32 Vectors with Carry
Performs an element-by-element three-input addition between int32 vector v1, int32 vector v3, and the corresponding bit of vector mask k2. The carry from the sum for the nth element is written into the nth bit of vector mask k2_res.
_M512I _mm512_adc_pi(_M512I v1, __mmask k2, _M512I v3, __mmask *k2_res)
_M512I _mm512_mask_adc_pi(_M512I v1, __mmask k1, __mmask k2, _M512I v3, __mmask *k2_res)

It is a 16-wide, 32-bit integer add with carry that stores the resulting carry-out bits in a 16-bit mask register and also accepts a mask register as the carry-in.
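
To make the semantics concrete, here is a scalar C++ model of what the description above says, written as a rough sketch rather than as the intrinsic itself. `Vec16`, `Mask16`, and `adc_model` are hypothetical stand-ins for `_M512I`, `__mmask`, and `_mm512_adc_pi`; the model assumes each lane reads and writes only its own mask bit, as the description states.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Vec16  = std::array<std::uint32_t, 16>;  // stand-in for _M512I
using Mask16 = std::uint16_t;                  // stand-in for __mmask

// Per-lane model: result[i] = v1[i] + v3[i] + bit i of carry_in,
// and bit i of *carry_out receives the carry out of lane i.
inline Vec16 adc_model(const Vec16& v1, Mask16 carry_in,
                       const Vec16& v3, Mask16* carry_out)
{
    Vec16 result{};
    Mask16 out = 0;
    for (std::size_t i = 0; i < result.size(); ++i) {
        // Widen to 64 bits so the lane's carry-out is visible in bit 32.
        std::uint64_t sum = std::uint64_t(v1[i]) + v3[i] + ((carry_in >> i) & 1u);
        result[i] = static_cast<std::uint32_t>(sum);
        if (sum >> 32) {
            out = static_cast<Mask16>(out | (1u << i));
        }
    }
    *carry_out = out;
    return result;
}
```

Note that in this model the carries do not ripple from one lane to the next; the carry from lane n lands in mask bit n, so using it for a single multi-word add would need the mask moved up by one position between steps.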

Regards,
Luke

 

