Boost Users :

Date view	Thread view	Subject view	Author view

From: John Maddock (jz.maddock_at_[hidden])
Date: 2021-08-20 16:42:38

Next message: Neill Clift: "Re: [multiprecission] Large divides layered on 128 bit divides"
Previous message: Neill Clift: "[multiprecission] Large divides layered on 128 bit divides"
In reply to: Neill Clift: "[multiprecission] Large divides layered on 128 bit divides"
Next in thread: Neill Clift: "Re: [multiprecission] Large divides layered on 128 bit divides"
Reply: Neill Clift: "Re: [multiprecission] Large divides layered on 128 bit divides"

On 19/08/2021 23:02, Neill Clift via Boost-users wrote:
>
> Hi,
>
> The architecture of cpp_int being build on 64 bit arithmetic using 128
> bit double_limb_type is interesting.
>
> I have a question on the large divide (divide_unsigned_helper). It
> uses the upper portions of the large integers to get an estimation of
> the quotient. Subtracts out a multiple of that quotient and repeats.
>
> It does this in 128 bit values if available from the compiler:
>
>          double_limb_type a =
> (static_cast<double_limb_type>(prem[r_order]) << CppInt1::limb_bits) |
> prem[r_order - 1];
>
>          double_limb_type b = py[y_order];
>
>          double_limb_type v = a / b;
>
> The compiler emulates this operation in the routine __udivmodti4 which
> itself uses an iterative approach.
>
> It seems to use a pretty basic shift and subtract algorithm mind you.
>
> As a general rule for multiprecision is it OK to layer the Knuth like
> algorithm D on top of each other this way.
>
> I have no idea myself but wonder if this is a known issue. I would
> have guessed that it made sense to do a 64 by 64 bit divide to guess
> the quotient and repeat.
>
Good question.

As I recall I tried both single-limb and double-limb partial-quotients
and the double-limb (128 bit) version was slightly faster.

There is a balance here between removing as large a chunk as you can
with each loop, compared to more expensive operations within the loop.
You could for example, perform "Karatsuba-like" division by splitting a
B-bit numerator into two B/2 chunks and perform schoolboy division on
the two "digit" numbers. But the fact that no-one seems to have done
this suggests how well it must work ;) On the other hand, __int128,
while a synthetic type, is sufficiently well optimised for this to be a
useful chunk size.

HTH, John.

-- 
This email has been checked for viruses by Avast antivirus software.
https://www.avast.com/antivirus

Next message: Neill Clift: "Re: [multiprecission] Large divides layered on 128 bit divides"
Previous message: Neill Clift: "[multiprecission] Large divides layered on 128 bit divides"
In reply to: Neill Clift: "[multiprecission] Large divides layered on 128 bit divides"
Next in thread: Neill Clift: "Re: [multiprecission] Large divides layered on 128 bit divides"
Reply: Neill Clift: "Re: [multiprecission] Large divides layered on 128 bit divides"

Date view	Thread view	Subject view	Author view

Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net