Boost logo

Boost :

Subject: Re: [boost] Large Integer Library
From: John Maddock (boost.regex_at_[hidden])
Date: 2012-07-03 04:38:28


> The storage of these types is just the actual numeric data, so I should be
> able to do binary I/O on them etc., which is good. I would perhaps be a
bit
> more confident of this if it were just an array of int32s or int64s,
rather
> than the current recursive organisation.

>I understand your trepidation, but really there should be no reason why
>binary
>I/O would not work provided that the bottom-level types are built-in
>integers
>or an equivalent, of the wordsize or greater. If smaller integer types are
>mixed-in, the compiler could add padding. There is also the question of
>big
>or little endian, which will affect the needed declaration order.

I think chasing binary IO is a fools errand - if you expect the result to be
compatible with builtin integers that is:

1) There's no guarantee that built in types are twos complement.
2) Larger than a single register integers are synthesised by the compiler -
in effect these are compiler supplied "big ints" - as such both the byte
order and limb order are completely arbitrary. For example on a 16 bit
processor (the simplest case), there are two choices for the byte order
within a 16 bit word, and then another two choices for the word order within
compiler synthesised 32-bit type. Unless you're going to write and test
code that handles all possibilities, and figuire out how to configure it so
it does the right thing on all platforms (including all the weird 8 and 16
bit embedded platforms out there). If you're sensible and don't shoot for
that, then there's no difference between the Multiprecision fixed int's and
2's complement ones: both are POD's and can be treated as a platform and
compiler specific "bag of bits".

>> You're using 2's complement, which I believe is the right choice.
>
>Agreed. It both makes the calculations simpler and makes large_int closer
>to the built-in types.

Not so fast: I started off using 2's complement arithmetic for fixed integer
types and was persuaded to change to sign-magnitude by some tests the
Boost.Polygon guys came up with.

Consider multiplication - for any sensible implementation for smallish int's
the complexity is O(N^2), for the Multiprecision lib's fixed int's, N is the
number of limbs that are actually used. For a packed 2's complement type, N
is the total number of limbs in the representation. That means that:

* For the Multiprecision lib, runtime of multiplication is independent of
the size of the type - it depends only on the magnitutde of the number being
stored - so for example
2 * 3 takes (roughly) the size of the type used - whether it's a 128, 256 or
2048-bit int.
* In contrast for a packed 2's complement type the complexity of
multiplication and division grows steaply with increasing bit size.

Now having said that, the sign-magnitude version is more complex, and more
expensive to construct and copy - not by much - but it shows up in tests
such as the delaunay code that Phil posted where a fair proportion of time
is spent constructing the large int's just for one quick calculation. It's
an important use case though, and one I need to come back to.

BTW, I've mentioned this a couple of times but it seems to have slipped
under the radar - the Multiprecision lib already has a packed 2's complement
fixed int back end, it's just that I removed it for the review because I
didn't think it added much. <Shrug> I guess.

> Consider a 256-bit int implemented using 64-bit built-in ints via inter-
> mediate 128-bit ints ... All of second half of that is redundant once you
> know that the least significant 64 bits incremented to a non-zero value
...
> if the implementation were just an array of 4 (u)int64_ts,
> you would iterate with a for loop and break when non-zero.

>This is another thing that would have to be tested. Unfortunately it's
>nigh-on
>impossible to know whether a version with or without more branching ('if'
>or
>'for') would be faster without trying it, and even then on multiple target
>architectures.

I fear Phil is asking the impossible here - you can have something really
simple which is fast in the trivial case (one stop larger than long long)
but rapidly slows for larger cases, or you can do what the Multiprecision
lib does and add all the special case handling which can give
order-of-magnitude improvements for 1024 bits and up, but slows things down
due to more branching by ~50% in the trivial case.

Sometimes you just have to make a choice.

>* Object sizes - you seem to be adding a pointer to all integers
>(presumably
> as a link between the front and back ends).

No, no such pointers anywhere. As mentioned about there is a packed 2's
complement fixed int backend in the sandbox, it just wasn't part of the
review (and yes it's size is exactly N bits, no pointers or other stray
members).

>* This also leads onto forward compatibility and interoperability. If we
>were

I don't believe so - remember that "larger than single register" types are
synthesized by the compiler, so the limb order they use is completely
dependent upon the whim of the compiler writer. Thinking you can be
forwards compatible with types that haven't been written yet is a fools
errand, even if one would hope that compiler writers all do something
sensible ;-)

John.

PS will investigate the assertion you spotted later when I can get 64-bit
Linux back up...


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk