Subject: Re: [boost] Large Integer Library
From: Michael Tryhorn (mjtryhorn_at_[hidden])
Date: 2012-07-02 14:43:52


John, Chris, and anyone else interested,

Firstly, to answer a question asked by Marc Glisse: the 'l' prefix on all
l[u]int types defined by large_int/def.hpp is there to distinguish the
(potentially) large_int-based types from those provided by cstdint or its
equivalents. These 'l'-prefixed types can therefore be imported into the
current or global namespace without interfering with any current or future
built-in equivalents. I'm not a fan of 'using namespace x::y::z;', but I
know some people do use it.
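
For illustration, a minimal sketch of that usage (the header path and the
construction from a built-in int are assumptions on my part, inferred from
the large_int/def.hpp and namespace names above):

    #include <boost/large_int/def.hpp>  // assumed location of the typedefs

    // Import only the prefixed names; they cannot collide with cstdint's
    // int128_t and friends, should those ever become built in.
    using boost::large_int::lint128_t;   // large_int-based signed 128-bit type
    using boost::large_int::luint128_t;  // unsigned counterpart, per 'l[u]int'

    lint128_t total(0);                  // presumably constructible from an int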

Now onto testing. I've tried the given example code on my local test machine
and can corroborate your results. I'm impressed - the performance of
operator* at least is astounding (even for 1024bit integers). I have tried
tweaking my classes here and there, and have even managed to halve the time
it takes large_int::operator* to run (by adding special 'left_shift_by_one'
and 'right_shift_by_one' private operations). I agree that what is needed is
a replacement of the algorithms rather than further tweaking, as the growth
rate against number-of-bits is rather high.
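
For reference, here is a minimal sketch of the kind of shift-and-add scheme
being discussed, modelled on plain built-in operands rather than on
large_int's actual internals (all names are illustrative). Each bit of the
right-hand operand costs a full-width shift, which is what makes the growth
against number-of-bits so steep:

    #include <boost/cstdint.hpp>

    // Schoolbook shift-and-add multiplication, one bit at a time. For an
    // N-bit operand this performs up to N full-width shifts and adds; the
    // result wraps modulo 2^64, matching fixed-width semantics.
    boost::uint64_t shift_and_add_multiply(boost::uint64_t lhs,
                                           boost::uint64_t rhs)
    {
        boost::uint64_t result = 0;
        while( rhs != 0 )
        {
            if( rhs & 1u )     // lowest bit set: accumulate the shifted lhs
            {
                result += lhs;
            }
            lhs <<= 1;         // the 'left_shift_by_one' analogue
            rhs >>= 1;         // the 'right_shift_by_one' analogue
        }
        return result;
    }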

It seems that other operations - construction/destruction, operator+ and
operator- - are comparable in speed. This is approximate, but I observed
large_int being quicker on simple operations like '+', '-' and so on at
128bit, equivalent at 256bit, and slower at 1024bit. But, considering the
overall speed of both Boost.LargeInt and Boost.Multiprecision in this
respect, I doubt that either would ever be seen as a bottleneck in any real
application.

However, I do now have a couple of genuine points of difference between
Boost.Multiprecision and Boost.LargeInt as it is now:

* Object sizes - you seem to be adding a pointer to all integers (presumably
  as a link between the front and back ends). I can understand the need for
  it, but it does increase the memory required to store arrays of them. As
  boost::large_int::large_int classes are fully concrete and only ever
  consist of their two internal parts, they have the same size as their
  built-in equivalent would; that is, sizeof( int128_t ) == sizeof( lint128_t ),
  unless the compiler decides to add class padding (though I have never seen
  padding added to a large_int object). See the sketch just after this list.

* This also leads on to forward compatibility and interoperability. If we
  were to serialize a large_int instance, it should have the same content as
  its equivalent built-in type. This aids transparency of use, although it
  is currently only true on little-endian machines. Big-endian support could
  be added by swapping the internal m_hi and m_lo large_int data members and
  adding some 'ntohl'/'htonl'-equivalent free functions to the
  boost::large_int namespace, as sketched below.
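
To make both points concrete, here is a small sketch using a stand-in
two-part layout rather than the real large_int internals (the struct, the
assertion and the byte-order helper below are all hypothetical):

    #include <boost/cstdint.hpp>
    #include <boost/static_assert.hpp>

    // Stand-in for a 128-bit large_int: two halves and nothing else, so the
    // object is exactly its numeric payload, with no extra pointer.
    struct lint128_like
    {
        boost::uint64_t m_lo;  // low 64 bits, stored first on little-endian
        boost::uint64_t m_hi;  // high 64 bits
    };

    BOOST_STATIC_ASSERT( sizeof( lint128_like ) == 16 );

    // An 'ntohl'/'htonl'-style free function: write the value out with the
    // most significant byte first (network order), regardless of host
    // endianness.
    inline void to_network_order(const lint128_like& v, unsigned char out[16])
    {
        for( int i = 7; i >= 0; --i )
        {
            out[7 - i]  = static_cast<unsigned char>( v.m_hi >> (8 * i) );
            out[15 - i] = static_cast<unsigned char>( v.m_lo >> (8 * i) );
        }
    }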

In conclusion, it is obvious that we are in agreement that support for
arbitrarily sized integers in C++ is required. I still believe that
Boost.LargeInt is a good contender: it has both the above points in its
favour and an elegant design (even if I do say so myself), and would just
need improvements to its integral division and multiplication.
Unfortunately, that is beyond my expertise, hence the current "simple but it
works" solution.

If you have any suggestions on how this could be improved or merged into
existing work, I would be more than happy to collaborate. Any further
questions or comments would of course be appreciated.

(Phil, I've seen your email and I will respond to it separately.)

Thanks,

Michael J. Tryhorn

P.S. I couldn't get the above testing code to run on either of my local
64bit machines, and had to resort to an old 32bit AMD machine which is
slowly
gathering dust. This is after inserting Boost.Multiprecision into Boost
v1.50.0.

System: Ubuntu 11.10 (64-bit)
          Intel® Core™2 Quad CPU Q6600 @ 2.40GHz × 4
Compiler: gcc (Ubuntu/Linaro 4.6.1-9ubuntu3) 4.6.1
Command: g++ -Wall -ggdb -O3 -march=native -I<snip>/include -L<snip>/lib \
 -o lint_tst lint_tst.cpp -Wl,-Bstatic -lboost_chrono -lboost_system \
 -Wl,-Bdynamic -lrt
Output:
32 bit time = 0.00177922 seconds
64 bit time = 0.00188884 seconds
lint_tst: <snip>/cpp_int.hpp:1090: void
boost::multiprecision::backends::eval_multiply(boost::multiprecision::backends::cpp_int_backend<MinBits,
Signed, Allocator>&, const
boost::multiprecision::backends::cpp_int_backend<MinBits, Signed,
Allocator>&, const
boost::multiprecision::backends::cpp_int_backend<MinBits, Signed,
Allocator>&) [with unsigned int MinBits = 128u, bool Signed = true,
Allocator = void]: Assertion
`(std::numeric_limits<double_limb_type>::max)() - carry >
static_cast<double_limb_type>(cpp_int_backend<MinBits, Signed,
Allocator>::max_limb_value) *
static_cast<double_limb_type>(cpp_int_backend<MinBits, Signed,
Allocator>::max_limb_value)' failed.

System: openSUSE 12.1 (64-bit)
          Intel® Xeon® CPU E5410 @ 2.33GHz × 4
Compiler: gcc (SUSE Linux) 4.6.2
Command: [same as above]
Output:
32 bit time = 0.00183981 seconds
64 bit time = 0.00189349 seconds
lint_tst: <snip>/cpp_int.hpp:1090: void
boost::multiprecision::backends::eval_multiply(boost::multiprecision::backends::cpp_int_backend<MinBits,
Signed, Allocator>&, const
boost::multiprecision::backends::cpp_int_backend<MinBits, Signed,
Allocator>&, const
boost::multiprecision::backends::cpp_int_backend<MinBits, Signed,
Allocator>&) [with unsigned int MinBits = 128u, bool Signed = true,
Allocator = void]: Assertion
`(std::numeric_limits<double_limb_type>::max)() - carry >
static_cast<double_limb_type>(cpp_int_backend<MinBits, Signed,
Allocator>::max_limb_value) *
static_cast<double_limb_type>(cpp_int_backend<MinBits, Signed,
Allocator>::max_limb_value)' failed.

(Same failure at other optimization levels, including O0, O1 and O2.)

This failure does not occur when running the same test on 32bit openSUSE
12.1,
and so may be integer-size related. Here are the sizes of the built-in
types,
for reference:

openSUSE 32bit:
sizeof( short ) == 2
sizeof( int ) == 4
sizeof( long ) == 4
sizeof( long long ) == 8
sizeof( float ) == 4
sizeof( double ) == 8
sizeof( long double ) == 12
sizeof( mp_uint128_t ) == 20
sizeof( mp_uint256_t ) == 36
sizeof( mp_uint512_t ) == 68
sizeof( mp_uint1024_t ) == 132

openSUSE/Ubuntu 64bit:
sizeof( short ) == 2
sizeof( int ) == 4
sizeof( long ) == 8
sizeof( long long ) == 8
sizeof( float ) == 4
sizeof( double ) == 8
sizeof( long double ) == 16
sizeof( mp_uint128_t ) == 24
sizeof( mp_uint256_t ) == 40
sizeof( mp_uint512_t ) == 72
sizeof( mp_uint1024_t ) == 136

On 2 July 2012 18:38, Phil Endecott <spam_from_boost_dev_at_[hidden]> wrote:

> Hi Michael,
>
>
> Michael Tryhorn wrote:
>
>> I have written a Boost-compatible library for the representation and
>> manipulation of integers of large or arbitrary size (including 128bit and
>> above). I would like to make a preliminary submission of this library for
>> your review and input, for potential inclusion into Boost v1.51.0 or
>> later,
>> as per the Boost Library Submission Process.
>>
>
> It would have been great if you could have joined the discussion a couple
> of weeks ago. Perhaps it is not too late for you to write a review of the
> proposed Boost.Multiprecision, since you clearly have some experience of
> the subject.
>
> I have had a quick look at your code and my initial thoughts are:
>
> - The storage of these types is just the actual numeric data, so I should
> be able to do binary I/O on them etc., which is good. I would perhaps be a
> bit more confident of this if it were just an array of int32s or int64s,
> rather than the current recursive organisation.
>
> - You're using 2's complement, which I believe is the right choice.
>
> - The simplicity of the code means it could be relatively bug-free, if
> we're lucky.
>
> - Bitwise multiplication and division is completely inappropriate.
>
> - You have an idiom of working on a temporary and then copying the result
> to *this to "commit" it, e.g.
>
> template<class T, class U>
> large_int<T, U>& large_int<T, U>::operator++ ()
> {
>     // Take a copy of *this, to modify
>     this_type ret(*this);
>
>     // Perform the increment
>     ++ret.m_lo;
>     if( ret.m_lo == low_part_type(0) )
>     {
>         ++ret.m_hi;
>     }
>
>     // Commit the result
>     return( *this = ret );
> }
>
> I don't see what the benefit of that is; it just makes ++ much slower than
> it otherwise would be.
>
> This example, operator++, also illustrates what may be a weakness of your
> recursive style. Consider a 256-bit int implemented using 64-bit built-in
> ints via intermediate 128-bit ints. The increment does this:
>
> increment the 128-bit m_lo:
> increment the 64-bit m_lo (builtin)
> test the 64-bit m_lo for zero (builtin) (assume common-case, not
> zero)
> test the 128-bit m_lo for zero (not built-in, impl in traits.hpp)
> test the 64-bit m_hi
> test the 64-bit m_lo
>
> All of second half of that is redundant once you know that the least
> significant 64 bits incremented to a non-zero value, but it is far from
> obvious to me that the compiler would successfully optimise it away. On
> the other hand, if the implementation were just an array of 4 (u)int64_ts,
> you would iterate with a for loop and break when non-zero.
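
For concreteness, a minimal sketch of the flat-array increment described
just above, assuming four 64-bit limbs stored least-significant first (the
struct and names are illustrative only, not large_int's actual layout):

    #include <boost/cstdint.hpp>

    // A 256-bit unsigned value as four 64-bit limbs, least-significant first.
    struct uint256_flat
    {
        boost::uint64_t limbs[4];
    };

    // Increment with an early exit: stop at the first limb that does not
    // wrap to zero, so the common case touches only the lowest limb.
    inline void increment(uint256_flat& value)
    {
        for( int i = 0; i < 4; ++i )
        {
            if( ++value.limbs[i] != 0 )
            {
                break;  // no carry to propagate further
            }
        }
    }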
>
> - It would be great to see "hooks" where assembler implementations could
> be inserted.
>
> - It would be interesting to know if your code could function as a "back
> end" to the proposed Boost.Multiprecision, using that library's expression
> templates to eliminate temporaries. There are two things that could be
> learnt from that: does that help performance, and does the Multiprecision
> front/backend split work?
>
>
> Regards, Phil.
>