From: Peter Dimov (pdimov_at_[hidden])
Date: 2025-01-21 15:20:02
Ivan Matek wrote:
> Trivially copyable types up to 2*uint64_t in size are passed in
> registers (on non-Windows x86-64), so pass by reference will be a
> performance regression, probably a significant one.
>
> If you have time, I suggest reading the PDF. I know your time is valuable,
> but I just think it is faster if we are on the same page.
I did read the PDF.
> In this particular case I tried to be pretty clear in the document about
> this. I was not talking about small types.
So, what types were you talking about? decimal32_fast is this:

    public:
        using significand_type = std::uint_fast32_t;
        using exponent_type = std::uint_fast8_t;
    private:
        significand_type significand_ {};
        exponent_type exponent_ {};
        bool sign_ {};

which is 64 bits.
decimal64_fast is this:

    public:
        using significand_type = std::uint_fast64_t;
        using exponent_type = std::uint_fast16_t;
    private:
        significand_type significand_ {};
        exponent_type exponent_ {};
        bool sign_ {};

which is 128 bits.
Only decimal128_fast doesn't fit in two registers.
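
The sizes above can be checked on a given toolchain with a small sketch like
the one below. The structs fast32_layout and fast64_layout and the helper
fits_in_two_registers are hypothetical stand-ins that only mirror the quoted
member layout; they are not the Boost.Decimal classes. The widths of the
std::uint_fastN_t types are implementation-defined, so the reported sizes may
differ between platforms, and no expected output is shown for that reason.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <type_traits>

// Hypothetical stand-ins mirroring the member layout quoted above.
// These are NOT the Boost.Decimal types.
struct fast32_layout {
    std::uint_fast32_t significand {};
    std::uint_fast8_t exponent {};
    bool sign {};
};

struct fast64_layout {
    std::uint_fast64_t significand {};
    std::uint_fast16_t exponent {};
    bool sign {};
};

// Rough size check only (an assumption, not a full ABI classification):
// a trivially copyable type of at most 2 * sizeof(std::uint64_t) bytes.
template <class T>
constexpr bool fits_in_two_registers() {
    return std::is_trivially_copyable<T>::value
        && sizeof(T) <= 2 * sizeof(std::uint64_t);
}

// Prints the size of one layout and the result of the rough check.
template <class T>
void report(const char* name) {
    std::printf("%s: %zu bytes, fits in two registers: %s\n",
                name, sizeof(T),
                fits_in_two_registers<T>() ? "yes" : "no");
}

// Reports both layouts; call this from main().
void print_layout_report() {
    report<fast32_layout>("fast32_layout");
    report<fast64_layout>("fast64_layout");
}
```

Calling print_layout_report() from main() prints the sizes for the current
compiler and standard library, which is the quickest way to see whether the
64 bit and 128 bit figures hold on a particular target.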
> Also, not all functions take 1 argument; in the PDF I
> explicitly used copysign as an example.
x86-64 (outside Windows) uses up to six integer registers for parameter
passing (RDI, RSI, RDX, RCX, R8, R9), which means that up to three 128-bit
trivially copyable types can be passed in registers when pass by value is used.
copysign only needs two.
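
As an illustration of the two-argument case, here is a minimal sketch with a
hypothetical type dec_like, which is at most 16 bytes and trivially copyable.
The names dec_like, copysign_by_value and copysign_by_ref are made up for this
example and are not the Boost.Decimal API. On non-Windows x86-64 each by-value
argument of this size would be expected to occupy two of the six integer
registers, so the by-value version would need four in total; inspecting the
generated assembly is the way to confirm this for a given compiler.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical trivially copyable type of at most 16 bytes.
// It is NOT a Boost.Decimal type.
struct dec_like {
    std::uint64_t significand;
    std::uint16_t exponent;
    bool sign;
};

static_assert(std::is_trivially_copyable<dec_like>::value,
              "dec_like must be trivially copyable");
static_assert(sizeof(dec_like) <= 16, "dec_like must fit in 128 bits");

// Pass by value: both arguments can travel in registers.
inline dec_like copysign_by_value(dec_like mag, dec_like sgn) noexcept {
    mag.sign = sgn.sign;  // take the sign of the second argument
    return mag;
}

// Pass by reference: the callee receives two pointers and has to load
// the members from memory unless the call is inlined.
inline dec_like copysign_by_ref(const dec_like& mag,
                                const dec_like& sgn) noexcept {
    dec_like result = mag;
    result.sign = sgn.sign;
    return result;
}
```

Both functions return the same value; the difference is only in how the
arguments reach the callee.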
> And it may be possible that even if decimal128 pass by value is faster
> decimal128_fast pass by value is not.
It's true that it's probably better to pass decimal128_fast by reference,
but the API inconsistency is probably not worth it.