
BoostCommit : 
From: arseny.kapoulkine_at_[hidden]
Date: 20070819 14:24:20
Author: zeux
Date: 20070819 14:24:19 EDT (Sun, 19 Aug 2007)
New Revision: 38765
URL: http://svn.boost.org/trac/boost/changeset/38765
Log:
Documentation and todo list fixes
Text files modified:
sandbox/SOC/2007/bigint/libs/bigint/index.html  35 ++++++++++
sandbox/SOC/2007/bigint/libs/bigint/todo.txt  84 +++++++++++++++++++++++++++++++++
2 files changed, 91 insertions(+), 28 deletions()
Modified: sandbox/SOC/2007/bigint/libs/bigint/index.html
==============================================================================
 sandbox/SOC/2007/bigint/libs/bigint/index.html (original)
+++ sandbox/SOC/2007/bigint/libs/bigint/index.html 20070819 14:24:19 EDT (Sun, 19 Aug 2007)
@@ 201,8 +201,6 @@
bigint_special a = 1, b = a * (a  4);
</pre>
<font color="#ff0000">As this can change and bigint_base can be parametrized by both implementation and storage class
in future, this section has to be revised before the project ends</font>
</p>
<h3><a name="Constructors">Constructors</a></h3>
@@ 225,7 +223,7 @@
boost::bigint d = c / 49;
</pre>
<font color="#ff0000"><b>bigint</b> relies on the presence of 64bit integer support, but this may change in the future</font>
+<b>bigint</b> relies on the presence of 64bit integer support, but this may change in the future.
</p>
<p>String constructors initialize <b>bigint</b> from string in the certain format. Strings can be wide, and you can pass either
@@ 243,8 +241,8 @@
initialize <b>bigint</b> instance to 3, string "+3" will initialize it to 3).</p>
<p>'digits' is a group of digits belonging to the character set of the base that is passed to the constructor; base is a number
that is greater or equal to 2 and less or equal to 36; bases outside of allowed range lead to undefined results <font color="#ff0000">
perhaps we should change that and initialize number to 0 if base is < 2 or > 36?</font>. For bases less or equal than 10,
+that is greater or equal to 2 and less or equal to 36; specifying base outside of allowed range leads to initializing bigint to
+zero. For bases less or equal than 10,
the character set of digits from 0 to (base1) is used. For bases greater than 10, the character set consists of digits from 0
to 9, and then the letters from 'A' (ASCII character 'A') to the last needed letter for the set to reach the needed size. Letters are not case sensitive.
Letter 'A' corresponds to digit 10, letter 'B' corresponds to digit 11 and so on. Letter 'Z' (if applicable  this is an allowed
@@ 453,11 +451,11 @@
<table><tr><th>Function(s)</th><th>'gmp' implementation</th><th>'default' implementation</th></tr>
<tr><td>default ctor</td><td>O(1)</td><td>O(1)</td></tr>
<tr><td>integer ctors</td><td>O(1)</td><td>O(1)</td></tr>
<tr><td>string ctors</td><td>O(A1) <font color="#ff0000">check</font></td><td>O(A1<sup>2</sup>)</td></tr>
+<tr><td>string ctors</td><td>O(A1<sup>2</sup>) for small A1, subquadratic algorithm for large A1</td><td>O(A1<sup>2</sup>)</td></tr>
<tr><td>+, , +=, =</td><td>O(M)</td><td>O(M)</td></tr>
<tr><td>*, *=</td><td><= O(M<sup>2</sup>) <font color="#ff0000">provide more precise info</font></td><td>O(M<sup>2</sup>)</td></tr>
<tr><td>/, /=</td><td><= O(M<sup>2</sup>) <font color="#ff0000">provide more precise info</font></td><td>O(M<sup>2</sup>)</td></tr>
<tr><td>%, %=</td><td><= O(M<sup>2</sup>) <font color="#ff0000">provide more precise info</font></td><td>O(M<sup>2</sup>)</td></tr>
+<tr><td>*, *=</td><td>O(M<sup>2</sup>), O(M<sup>1.585</sup>), O(M<sup>1.465</sup>) or O(M<sup>k / (k1)</sup>)<br>(k<sub>max</sub>=10, complexity is O(M<sup>1.111</sup>) for very large M), depending on M</td><td>O(M<sup>2</sup>) for small M, O(M*logM) for large M</td></tr>
+<tr><td>/, /=</td><td>O(M<sup>2</sup>) for small M, O(M<sup>k / (k1)</sup>*logM) (k<sub>max</sub>=10) for large M</td><td>O(M<sup>2</sup>)</td></tr>
+<tr><td>%, %=</td><td>O(M<sup>2</sup>) for small M, O(M<sup>k / (k1)</sup>*logM) (k<sub>max</sub>=10) for large M</td><td>O(M<sup>2</sup>)</td></tr>
<tr><td>&, &=, , =, ^, ^=</td><td>O(M)</td><td>O(M)</td></tr>
<tr><td><<, >></td><td>O(max(A1, 2<sup>A2</sup>)) <font color="#808080"> NB. O(2<sup>A2</sup>) is equal to O(shift amount)</font></td><td>O(max(A1, 2<sup>A2</sup>))</td></tr>
<tr><td><<=, >>=</td><td>O(max(N, 2<sup>A1</sup>)) <font color="#808080"> NB. O(2<sup>A1</sup>) is equal to O(shift amount)</font></td><td>O(max(N, 2<sup>A1</sup>))</td></tr>
@@ 465,14 +463,12 @@
<tr><td>+, </td><td>O(1)</td><td>O(1)</td></tr>
<tr><td>~</td><td>O(N)</td><td>O(N)</td></tr>
<tr><td>bool conversions</td><td>O(1)</td><td>O(1)</td></tr>
<tr><td>string conversions</td><td>O(N) <font color="#ff0000">check</font></td><td>O(N<sup>2</sup>)</td></tr>
+<tr><td>string conversions</td><td>O(N<sup>2</sup>) for small N, subquadratic algorithm for large N</td><td>O(N<sup>2</sup>)</td></tr>
<tr><td>can_convert_to</td><td>O(1)</font></td><td>O(1)</td></tr>
<tr><td>to_number</td><td>O(N)</font></td><td>O(N)</td></tr>
<tr><td>comparison operators</td><td>O(M)</td><td>O(M)</td></tr>
<tr><td>abs</td><td>O(1)</font></td><td>O(1)</td></tr>
<tr><td>pow</td><td>O(A1 * A2) <font color="#ff0000">check</font></td><td>???</td></tr>
<tr><td>div</td><td><= O(M<sup>2</sup>) <font color="#ff0000">provide more precise info</font></td><td>O(M<sup>2</sup>)</tr>
<tr><td>sqrt</td><td><= O(M<sup>2</sup>) <font color="#ff0000">provide more precise info</font></td><td>???</td></tr>
+<tr><td>div</td><td>O(M<sup>2</sup>) for small M, O(M<sup>k / (k1)</sup>*logM) (k<sub>max</sub>=10) for large M</td><td>O(M<sup>2</sup>)</td></tr>
</table>
<h2><a name="Exceptions">Exceptions</a></h2>
@@ 482,6 +478,17 @@
destructor) may throw a <code>std::bad_alloc</code> exception. There are no other possible
exceptions.</p>
+<p>Note: the GMPbased implementation has a shortcoming  GMP does not have a proper error handling strategy for memory
+allocations; therefore, any failing allocation aborts the program execution. There is no proper solution for this problem
+(available solutions either mean rewriting huge parts of GMP, which is hard, uses the knowledge of the GMP implementation
+details, that may change in future, and does not completely solve the problem (because of temporary allocations in mpn),
+or writing some complex allocator with rollbacking abilities, which means writing not thread safe code, some performance
+overhead, lots of changes to existing codebase, etc.), so the decision has been made to leave it as is and just document this
+shortcoming  hopefully, either the performance characteristics of other implementations will soon asymptotically match GMPs
+ones (so that the performance will differ only by a constant factor), or appropriate error handling will be introduced to GMP
+(this is in a list of "projects suitable for volunteers" (see projects.html in GMP documentation)).
+
+
<h2><a name="References">References</a></h2>
<ul>
<li>The bigint header itself: bigint.hpp</li>
@@ 509,7 +516,7 @@
<p>I shamelessly stole some data for unit tests from Big Integer Libary by Richard Peters.</p>
<p>Revised June 16, 2007</p>
+<p>Revised August 18, 2007</p>
<p>© Copyright Arseny Kapoulkine 2007. Permission to
copy, use, modify, sell and distribute this document is granted provided this
Modified: sandbox/SOC/2007/bigint/libs/bigint/todo.txt
==============================================================================
 sandbox/SOC/2007/bigint/libs/bigint/todo.txt (original)
+++ sandbox/SOC/2007/bigint/libs/bigint/todo.txt 20070819 14:24:19 EDT (Sun, 19 Aug 2007)
@@ 13,10 +13,10 @@
1. Bigint interface (header) 21 May Awaiting resolution
2. GMP implementation 21 May Awaiting resolution
3. Correctness tests 7 June Awaiting resolution
4. Documentation 21 June In progress
5. Performance tests 7 July In progress
6. Interface for storage 14 July In progress
7. Storage implementation 14 July In progress
+4. Documentation 21 June Awaiting resolution
+5. Performance tests 7 July Awaiting resolution
+6. Interface for storage 14 July Awaiting resolution
+7. Storage implementation 14 July Awaiting resolution
8. Default implementation 30 July Awaiting resolution
9. Advanced algorithms N/A In progress
@@ 72,6 +72,12 @@
+ stream input support with all needed modifiers and other things
Status: implemented
++ fix const correctness issues
+Status: fixed
+
++ fix == 0 and != 0 issue
+Status: fixed
+
2. GMP implementation
+ conversion from string with different bases (236, use 26 letters + 10 digits)
@@ 174,13 +180,38 @@
4. Documentation
 do we have to change the current behaviour for string conversions for bases outside [2, 36]?
Status: needs investigating & implementing
++ do we have to change the current behaviour for string conversions for bases outside [2, 36]?
+Status: now returns zero
 provide accurate information for 'Performance' table
++ provide accurate information for 'Performance' table
+Status: implemented
+
++ document the shortcoming of GMP implementation related to memory allocation failure
+Status: implemented
5. Performance tests
++ establish a framework (mask for tests, mask for implementation, output types (csv, html table), list all tests/implementations
+Status: implemented
+
++ basic arithmetics tests (+, )  small and large operands
+Status: implemented
+
++ basic arithmetics tests (*, /, %)  small, medium and large operands
+Status: implemented (except for large operands for / and %  current default implementation is too slow)
+
++ basic arithmetics tests (bitwise ops, shifts)  small and large operands
+Status: implemented
+
++ sqrt test
+Status: implemented
+
++ string conversion tests
+Status: implemented
+
++ complex tests (gcd, fibonacci, isprime, something else?)
+Status: implemented
+
6. Interface for storage
+ do we need push_back?
@@ 189,16 +220,16 @@
+ refactor
Status: implemented
 perhaps move bool negative to storage?
Status: needs investigation
++ perhaps move bool negative to storage?
+Status: no
7. Storage implementation
+ implement stack based fixed capacity storage; add corresponding tests
Status: implemented
 implement all functionality for vectorbased interface as set by interface requirements
Status: needs implementing
++ implement all functionality for vectorbased interface as set by interface requirements
+Status: done
8. Default implementation
@@ 256,10 +287,16 @@
+ optimize string conversion
Status: implemented; 6x speedup for fibonacci(10^6)
++ optimize bitshifts in case of exact shift count (shift count is divisible by limb bit count)
+Status: implemented
+
9. Advanced algorithms
 check modmul with either existing test or some other way
Status: needs checking
++ check modmul with either existing test or some other way
+Status: checked & fixed
+
++ choose either FFT or basecase method, depending on the input size
+Status: implemented
 perhaps support FFT with 3 primes for base = 2^32?
Status: needs investigation
@@ 286,11 +323,30 @@
28.07.07. Further research showed that limb allocation in system layer (mpn) occurs for multiplication (Toom3 and FFT),
string <> number conversions, division, sqrt, and perhaps somewhere else :(
+18.08.07. This is obviously a shortcoming of GMP (and this is documented in GMP manual). It is quite hard to solve correctly 
+the only viable way is to make own allocation functions that store all allocations in some container, and have a "rollback"
+function which purges all allocations; then all gmp implementation functions have to be wrapped with a code like this:
+
+allocator.init_storing_mode(); // here we set storing mode to on
+
+if (setjmp(allocator.jmpbuf))
+{
+ allocator.purge();
+}
+else
+{
+ ... (do some stuff, allocator has longjmp if allocation fails)
+}
+
+allocator.reset_storing_mode(); // here we clear array with pointers and set storing mode to off
+
+This is not threadsafe, has performance overhead, has a lot of code overhead, and is generally a PITA  therefore the decision
+is made to not do anything about the problem, and just document it as a shortcoming of GMP implementation.
+

Extra features:
 expression templates for base class (should be done so that this does not affect any implementation)
 advanced algorithms for default implementation
 make GMP work with arbitrary storage strategy (and perhaps enforce supporting arbitrary strategy in future implementations)
 getting several bits from an integer (say, 7 bits from 4859th one). Perhaps, assigning several bits.
 conversion from/to floating point (float, double, long double)
BoostCommit list run by bdawes at acm.org, david.abrahams at rcn.com, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk