# Boost :

Subject: [boost] Algebraic abstractions related to (big) integers in C++
From: Thomas Klimpel (Thomas.Klimpel_at_[hidden])
Date: 2011-03-26 19:21:04

Dear (big) integer enthusiasts,

The goal of this message is to reflect on the two's complement representation of negative integers, from a relatively abstract algebraic perspective. Some of the questions that arise in this context are whether the two's complement representation would be well suited for

(i) a fixed width integer data type
(ii) a fixed point number data type
(iii) a variable width integer data type
(iv) a floating point number data type

Circumstantial evidence suggests that it's well suited for (i) and (ii), but not well suited for (iv). The following quote from Bill Gosper (1972) "By this strategy, consider the universe, or, more precisely, algebra [...] therefore algebra is run on a machine (the universe) which is twos-complement." (<http://www.inwap.com/pdp10/hbaker/hakmem/hacks.html#item154>) could be seen as evidence in favor of using two's complement representation for (iii). However, XInt and other bigint libraries took the opposite decision, so there is strong circumstantial evidence against using two's complement representation for (iii).

Let me start with a disclaimer that I'm not an expert in algebra. The following thoughts emerged when I tried to understand the semi-direct product. I wanted to see how it works for a concrete "small" group. So I tried to write the Quaternion group (the one with the 8 elements 1, -1, i, -i, j, -j, k, -k) as a semi-direct product. This didn't work, because the Quaternion group is not a semi-direct product. But after some time I managed to "decompose" the Quaternion group into abelian groups in a different way. I subsequently learned that this sort of problem is called the group extension problem <http://en.wikipedia.org/wiki/Group_extension>. It's a solved problem, but I would have to learn some homological algebra to understand its solution.

Let's look at the "ring of integers" (denoted by "Z") and the "ring of integers modulo m" (denoted by "Z_m"). We want to "approximate" the "ring of integers" by the "ring of integers modulo m". One of the advantages of this approach is that it's relatively easy to represent the "ring of integers modulo m" in a computer, especially for "m = 2^n" (more concretely "m = 2^32" for int32_t and "m = 2^64" for int64_t).

There exists exactly one non-trivial ring homomorphism "p : Z -> Z_m" from the "ring of integers" into the "ring of integers modulo m". (The integer "1" is mapped to the corresponding "1", and the additive group of the "ring of integers modulo m" is generated by "1".) A good way to describe how we "approximate" "Z" by "Z_m" is to define a map "q : Z_m -> Z" such that "p(q(i)) = i for all i". Because the map "q" is not uniquely determined by the condition "p(q(i))=i", let's add the additional constraint "-m/2 <= q(i) < m/2". It's easily checked that this uniquely determines the map "q".

This map "q" is not a ring homomorphism, but that's OK as there does not exists any non-trivial ring homomorphism from "Z_m" into "Z". Still we should try to measure the deviation of "q" from a ring homomorphism by taking a look at "S(a, b) := q(a) + q(b) - q(a + b)" and "P(a, b) := q(a) q(b) - q(a b)". It's easily checked that "p(S(a, b)) = 0" and "p(P(a, b)) = 0". This shows that "S(a, b)" and "P(a, b)" are multiples of "m". A quick estimation shows "-1 <= S(a, b)/m <= 1" and "-(m+1)/4 < P(a, b)/m <= (m+2)/4". This doesn't look too bad, but still bad enough to question whether the constraint "-m/2 <= q(i) < m/2" is really the way to go.

Instead of the constraint "-m/2 <= q(i) < m/2", we could have also used the constraint "0 <= q(i) < m" or the constraint "-m/2 < q(i) <= m/2". One reason to prefer the constraint "-m/2 <= q(i) < m/2" over the constraint "-m/2 < q(i) <= m/2" is that we exactly get the two's complement representation of signed integers for "m = 2^n", which was the very reason that we investigated these algebraic relations in the first place. It's also more consistent with the "a <= x < b" representation of ranges so typical for C/C++.

I want to give a third reason why the constraint "-m/2 <= q(i) < m/2" should be preferred over the constraint "-m/2 < q(i) <= m/2". Let's define a "round to nearest" division and the corresponding modulo operation for the "ring of integers" by "rdiv(a, b) := floor(1/2 + a/b)" and "rmod(a, b) := a - b rdiv(a, b)". It's easily checked that "-m/2 <= rmod(i, m) < m/2". It could be objected that implementing "round to nearest" by "i = floor(0.5 + x);" is only correct for non-negative values of "x". The correct implementation of "round to nearest" seems to be "i = x >= 0 ? floor(x + 0.5) : ceil(x - 0.5);". But I don't think that the more complicated definition of "round to nearest" is always "more" correct than the simpler one. At least when manipulating polygons, there can be situations where the simpler definition turns out to be the "more" correct one.

But let's investigate to constraint "0 <= q(i) < m" now. A quick estimation shows "0 <= S(a, b)/m <= 1" and "0 <= P(a, b)/m < m-1". So "S(a, b)/m" could be stored in a single "carry bit" and "P(a, b)/m" can be stored as a value of "Z_m". It is therefore not uncommon to have access to "S(a, b)/m" in machine language via a "carry bit" and to "P(a, b)/m" by using a machine instruction that computes "a b" and "P(a, b)/m" at the same time and stores "a b" and "P(a, b)" in different registers. (My experiences with machine language are from the time before 1996, so pardon me if these statement should not be fully correct.) The statement about the carry bit is not fully correct, because the carry bit is normally not just a result of an addition in machine language, but is also taken into account during the addition. So if "c" is the carry bit before the addition, the addition will return the value "p(c) + a + b" in "Z_m" and the new carry bit "(c + q(a) + q(b) - q(p(c) + a + b))/m". It's probably even possible to write this "add_with_carry" operation as a function in C or as a template in C++. But I don't really believe that many compilers will notice the equivalence between a template like the following and the "add_with_carry" machine instruction.

template <Uint>
typename boost::enable_if<boost::is_unsigned<Uint>, bool>::type
add_with_carry(Uint& a, const Uint& b, bool c) {
a += (c != 0);
a += b;
return (c != 0) ? (a <= b) : (a < b);
}

And I'm pretty certain that there doesn't exist any compiler will notice the equivalence between a template like the following and the mentioned machine instruction for a multiplication returning the complete result. I'm not even certain myself that they are equivalent (but I could write a test to verify it for me, if I really wanted to be certain).

template <Uint>
typename boost::enable_if<boost::is_unsigned<Uint>, Uint>::type
multiply_full_result(Uint& a, const Uint& b) {
Uint upper_a = (a >> (sizeof(Unit)/2));
Uint upper_b = (b >> (sizeof(Unit)/2));
Uint lower_a = a & (Uint(-1) >> (sizeof(Unit)/2));
Uint lower_b = b & (Uint(-1) >> (sizeof(Unit)/2));
Uint upper = upper_a * upper_b;
upper += ((upper_a*lower_b) >> (sizeof(Unit)/2));
upper += ((lower_a*upper_b) >> (sizeof(Unit)/2));
a *= b;
return upper;
}

I'm starting to get problems with the length of this message now, because it is quite reasonable to expect that the number of people that will even try to read the message gets lower and lower as the message gets longer and longer. And this is already my third start from scratch trying to come up with something reasonable in size and readability. My first attempt directly started with p-adic numbers and p-adic integers, then tried to explain how the two's complement representation is related to 2-adic integers. It noted that it is indeed an approximation in the 2-adic metric, but this normally doesn't buy us anything, because the 2-adic metric is almost never the relevant metric for our problem domain at hand. This is also one of the reasons why the two's complement representation should be avoided for floating point numbers, because the relevant metric in this case is the normal metric of the real numbers.

My second attempt tried to take the opposite approach, and started with the C++ statement "std::size_t n = std::max(1, c.size());", noting that it will provoke a compile error. After some discussion and additional examples like the expressions "c.size()/-2" and "-c.size()/2", it claimed that Haskell was able to solve these problems by not assigning any concrete type to "2" at all, but just the abstract "Num" typeclass. One part of the problems that are meant here is described in the following statement:

"What I consider especially strange from an algebraic point of view, is to automatically promote a signed integer type to its corresponding unsigned integer type, when an arithmetic expression contains both signed and unsigned integer types. Shouldn't it be the other way round? Or wouldn't it be even better, if there would be no automatic conversion at all? At least from an algebraic point of view, any natural number is also an integer, not vice versa like in C."

So let's try to come to some sort of conclusion now, even so not all semantical issues I wanted to discuss have been raised, and nearly no solutions for the raised issues have been presented. It looks to me now like it can be advantageous to implement operations for signed (variable or fixed width) integer data types in terms of operations for unsigned integer data types. This can be done independent of whether the two's complement representation is used or not. However, we should still be careful to provide the correct semantic for signed integer types, even if we implement them in terms of unsigned integer types. It is a questionable design when the TTMath bignum library lets it signed integer type derive public from its unsigned integer type ("template<uint value_size> class Int : public UInt<value_size> { ... };"), because the relationship between signed and unsigned integer types is not a "is a" relationship as required for public inheritance. (Think about it, a signed integer is really not an unsigned integer, even if the unsigned integer type would model the "ring of integers modulo m" instead of the "natural numbers".) Is is also questionable when a relatively new language like the D programming language cripples its "Usual Arithmetic Conversions" by stating "The signed type is converted to the unsigned type." as rule 4.4. Even so it is true that an integer can be converted to an "integer modulo m" (remember the ring homomorphism "p : Z -> Z_m"), are there really good reasons to apply such a conversion automatically? For the C/C++ language, the answer is definitively yes: "backward compatibility". But I don't think that this reason would have applied to D. And please pardon me if I imply press the send button now.

Regards,
Thomas