From: Matt Borland (matt_at_[hidden])
Date: 2025-01-21 18:12:14
On Tuesday, January 21st, 2025 at 12:56 PM, Ivan Matek via Boost <boost_at_[hidden]> wrote:
>
>
> On Tue, Jan 21, 2025 at 6:06 PM Peter Dimov pdimov_at_[hidden] wrote:
>
> > Basically, _fast types are almost never fast. Let's hope this curse doesn't
> > afflict Decimal _fast types as well. :-)
> >
>
> Thank you for confirming this, now I regret not writing that also in
> review. :)
>
> To recap the discussion wrt pass by reference:
> points we agree on (wrt 64-bit x86):
>
> - if we pass by value on Windows we are out of luck for most decimal
> types, since there is an ABI limitation for types greater than 8 bytes
> - use of uint_fast16_t in the implementation pushed decimal64_fast over the
> limit of the Linux ABI (16 bytes); the std:: _fast types are not fast, so it
> would be nice to change this even if it will not help on Windows
>
>
> points we disagree on:
>
> - pass by reference for large types (and if necessary mutate in place) is
> still my preferred API. If we do want value-returning functions, then for
> large types I would prefer to pass args by const reference.
>
>
> Not trying to change your mind, just recapping above discussion, doing all
> sizeof and ABI math is tricky.
>
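For anyone wanting to redo the sizeof and ABI math from the recap above on their own platform, here is a minimal sketch. The two structs are hypothetical stand-ins and not the actual Boost.Decimal layouts, and fits_sysv_register_limit is a made-up helper that only encodes the 16-byte figure quoted above. On x86-64 glibc uint_fast16_t is typically 8 bytes wide, which is how a layout like this can grow from 16 to 24 bytes; other platforms will report different numbers.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>

// Hypothetical layouts for illustration only; NOT the Boost.Decimal definitions.
struct layout_fast {
    std::uint_fast64_t significand;
    std::uint_fast16_t exponent;
    bool sign;
};

struct layout_fixed {
    std::uint64_t significand;
    std::uint16_t exponent;
    bool sign;
};

// Simplified assumption taken from the thread: on 64-bit Linux a struct of at
// most 16 bytes can still be passed by value in registers.
constexpr bool fits_sysv_register_limit(std::size_t bytes) {
    return bytes <= 16;
}

// Prints the sizes that decide which side of the limit each layout lands on.
inline void print_layout_sizes() {
    std::printf("uint_fast16_t: %zu bytes\n", sizeof(std::uint_fast16_t));
    std::printf("uint16_t:      %zu bytes\n", sizeof(std::uint16_t));
    std::printf("layout_fast:   %zu bytes (fits: %d)\n", sizeof(layout_fast),
                static_cast<int>(fits_sysv_register_limit(sizeof(layout_fast))));
    std::printf("layout_fixed:  %zu bytes (fits: %d)\n", sizeof(layout_fixed),
                static_cast<int>(fits_sysv_register_limit(sizeof(layout_fixed))));
}
```

Calling print_layout_sizes() from main on each target of interest gives the numbers to compare; no expected output is listed here because it depends on the platform's typedefs.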
Here's a data point on macOS ARM64:
Benchmarks on Current Develop:
===== Comparisons =====
comparisons<dec32_fast >: 555534 us (s=29999985)
comparisons<dec64_fast >: 680204 us (s=29999985)
comparisons<dec128_fast>: 598125 us (s=29999985)
===== Addition =====
Addition<dec32_fast >: 1112121 us
Addition<dec64_fast >: 1282197 us
Addition<dec128_fast>: 6967490 us
===== Subtraction =====
Subtraction<dec32_fast >: 930937 us
Subtraction<dec64_fast >: 1127780 us
Subtraction<dec128_fast>: 3645142 us
===== Multiplication =====
Multiplication<dec32_fast >: 776308 us
Multiplication<dec64_fast >: 1145653 us
Multiplication<dec128_fast>: 17949983 us
===== Division =====
Division<dec32_fast >: 904825 us
Division<dec64_fast >: 1659034 us
Division<dec128_fast>: 1648759 us
Every uint_fastXX_t or int_fastXX_t replaced by uintXX_t or intXX_t:
===== Comparisons =====
comparisons<dec32_fast >: 586511 us
comparisons<dec64_fast >: 709211 us
comparisons<dec128_fast>: 657634 us
===== Addition =====
Addition<dec32_fast >: 1121786 us
Addition<dec64_fast >: 1289409 us
Addition<dec128_fast>: 7020519 us
===== Subtraction =====
Subtraction<dec32_fast >: 963632 us
Subtraction<dec64_fast >: 1160965 us
Subtraction<dec128_fast>: 3695792 us
===== Multiplication =====
Multiplication<dec32_fast >: 800159 us
Multiplication<dec64_fast >: 1179959 us
Multiplication<dec128_fast>: 18459046 us
===== Division =====
Division<dec32_fast >: 928643 us
Division<dec64_fast >: 1683552 us
Division<dec128_fast>: 1684860 us
So we'd need to investigate all the different ABIs and switch the type per platform. I'll note that my primary development is on ARM Mac. Another win for Intel's complete vertical integration.
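A rough sketch of what switching the type per platform could look like. The macro SKETCH_USE_FAST_INTS, the sketch namespace, and the alias names are all hypothetical, and the platform test is only a placeholder: which platforms belong in which branch would have to come from benchmarks like the ones above on each ABI.

```cpp
#include <climits>
#include <cstdint>

// Hypothetical switch: keep the _fast typedefs only where they were measured
// to be no slower (placeholder condition: macOS on ARM64).
#if defined(__APPLE__) && defined(__aarch64__)
#  define SKETCH_USE_FAST_INTS 1
#else
#  define SKETCH_USE_FAST_INTS 0
#endif

namespace sketch {

#if SKETCH_USE_FAST_INTS
using significand_type = std::uint_fast64_t;
using exponent_type    = std::uint_fast16_t;
#else
using significand_type = std::uint64_t;
using exponent_type    = std::uint16_t;
#endif

// Either branch must still provide at least the required width.
static_assert(sizeof(significand_type) * CHAR_BIT >= 64, "significand too narrow");
static_assert(sizeof(exponent_type) * CHAR_BIT >= 16, "exponent too narrow");

} // namespace sketch
```

The rest of the implementation would then use sketch::significand_type and sketch::exponent_type instead of naming a fixed-width or _fast type directly, so the choice lives in one place.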
Matt