Boost logo

Boost :

Subject: Re: [boost] Looking for some "real world" extended precision integer arithmetic tests
From: Andrii Sydorchuk (sydorchuk.andriy_at_[hidden])
Date: 2012-03-22 09:36:28


Ah, you probably misunderstood what I meant.

Yes, I misunderstood that you changed your fixed int implementation to a
new one.

There are other similar examples, but basically my case is that to make the
> integer type general purpose there are actually lots of these special case
> checks, each of which are there for a very good reason, but which
> collectively easily account for that 10% slowdown in my code compared to
> yours.

Well, as I noticed before it's not just 10%, because algorithm execution
time should be subtracted. Nevertheless I completely understand that it is
not possible to get as fast code as mine. I would consider generalized
implementation that is 50% slower to be pretty good (but not 3 times slower
as before). It seems that you manage to do that so accept my
congratulations! I am willing to have a look at your implementation, is it
already in sandbox?

Again, another compromise: sign magnitude is actually very slightly slower
> than the old 2's complement code for some operations, but has enough other
> advantages to make it worthwhile (so you sold me on that)

Could you list those? I am willing to test :-)

It also has a template parameter that lets you tune the size of the
> internal cache (the bit that's used before any memory allocation takes
> place).

I am not sure what is internal cache here. Need to have a look at your
implementation. Did you implement two separate backends: stack allocated,
heap allocated? Or did you come up with generalized implementation that
uses fixed array for small numbers and switches to dynamic array if exceeds
its capacity?

 Basically what I'm saying here, is there is no one true configuration,
> that offers the very best performance in all cases, if there was, we'd
> probably all be using GMP ;-)

Completely agree. But from advanced user perspective it's good to be able
to specify configuration (stack or heap allocation, limb size, etc). For
example it is possible to rewrite all the Voronoi code using GMP almost
without any performance lose. But this will require to split huge math
expressions onto atomic formulas, thus reducing code readability
and maintainability. That's the main reason I decided to switch to fixed

PS, the next step is to start stealing some more special case code from the
> SOC project.... I fully expect that to make your benchmarks run very
> slightly more slowly, but is necessary for other use cases...

I'd like to have a look at your backend implementation. It seems to be
already well done from your Voronoi benchmark. So it will be good to move
parts of that SOC projects to your current implementation. One of the
useful things we can think of is FFT multiplication (which is fast for
large numbers). For medium size numbers I would think of Karatsuba
multiplication. I would suggest generalized multiplication that will
combine those two plus usual quadratic multiplication for small numbers. I
can help on testing those and would ask Arseniy if he's interested on
migrating his FFT code. Does it sound good for you?


Unsubscribe & other changes:**

Boost list run by bdawes at, gregod at, cpdaniel at, john at