Subject: [boost] [Review] XInt Review
From: Joel Falcou (joel.falcou_at_[hidden])
Date: 2011-03-04 04:49:25


Here is my review of XInt. Currently I think the library should not be
included in Boost, as a lot of problems, in my opinion, still have to be
fixed and the design has to be rethought.

***********************************************************************
* What is your evaluation of the design?

Large integers are a bit like XML and Unicode. Lots of people want to write
such a library because it is highly useful and, from afar, it doesn't
sound *this* complicated.

Alas, it is :/

The premise is OK. We have a large integer type that has proper value
semantics and can be used in a variety of operations and functions.
So far, so good. The main problem is that the current library design is
rather hardwired to an implementation choice the author thought to be OK
at some point, but which still has to be proven better than existing
solutions.

The overall design lacks separation between the data and the algorithms,
which makes the library difficult to extend: what if I want to take an
arbitrary range of integers and look at them as a huge series of bits?
What if I don't want the extra memory overhead, so I can mmap some files
directly into memory (because reading in a 10k-bit integer is a no-no)?
Can I pass the integer type to a standard algorithm if I want to iterate
over the bytes it contains, or copy them using boost::range copy?
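
As a hypothetical illustration (nothing below is actual XInt API; the
limbs() accessor and the digit type are invented), this is the kind of
range-friendly code I would like to be able to write against the integer
type:

    #include <vector>
    #include <algorithm>
    #include <iterator>
    #include <boost/cstdint.hpp>
    #include <boost/range/algorithm/copy.hpp>

    template <typename BigInt>
    std::vector<boost::uint32_t> dump_limbs(BigInt const& n)
    {
        std::vector<boost::uint32_t> words;
        // Imagined limbs() member returning an iterator range over the
        // internal digits, usable with any standard or Boost algorithm.
        boost::copy(n.limbs(), std::back_inserter(words));
        std::reverse(words.begin(), words.end()); // most significant first
        return words;
    }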

On the data side, I think being forced to use a given representation is
wrong. Ideally the library would provide a fallback implementation, but
also adaptors so that any kind of data can be viewed as a large integer.
The definition of a LargeInteger concept would help define such an
adapting interface and make the library more fluid. For example, what if
I want to implement LCM?
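
To make the point concrete, here is a minimal sketch (not XInt code) of an
algorithm written against such a LargeInteger concept rather than against
a concrete class. The only requirements assumed here are value semantics,
construction from int, ==/!= comparison and the %, / and * operators, so
the same code would work on XInt's integer type, an adapted GMP type or
plain built-in integers:

    template <typename LargeInt>
    LargeInt gcd(LargeInt a, LargeInt b)
    {
        // Classic Euclidean algorithm, expressed only through the
        // concept's operations.
        while (b != LargeInt(0)) {
            LargeInt r = a % b;
            a = b;
            b = r;
        }
        return a;
    }

    template <typename LargeInt>
    LargeInt lcm(LargeInt const& a, LargeInt const& b)
    {
        if (a == LargeInt(0) || b == LargeInt(0))
            return LargeInt(0);
        return (a / gcd(a, b)) * b;  // divide first to limit growth
    }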

* What is your evaluation of the implementation?

The implementation is poor for various reasons:

  - The author uses COW for performance reasons. Alas, COW has been
demonstrated again and again to be of little interest. Looking at the
problem for ten seconds (types holding arbitrarily large amounts of data,
to be chained in operations without copies) should have led to an
expression-template based solution: capture a whole chain of operations on
values, compute the worst-case number of bits to store, allocate once,
then compute in place. This prevents spurious allocations and makes the
algorithms easier to write. Additionally, expression templates would help
detect expression patterns for which a non-naive algorithm could be
applied (see the sketch after this list).

  - Virtual inheritance: looking at integer_t shows:

    class integer_t: virtual public ....

    Why? It seems you're trying to implement some policy-based design, but
I can't see why virtual inheritance is needed. Do you otherwise get
multiple-definition problems in your mixins?

  - There are a lot of unwarranted pass-by-value function prototypes,
which seriously impact performance even with COW: pass-by-reference beats
pass-by-value + COW. Moreover, the "thread safety" of the COW
implementation is dubious.

  - The headers are obscenely big. It would be easier to follow the code
if it were not split into these 2k+ line mammoth files.
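
To illustrate the expression-template point from the first bullet above,
here is a deliberately naive, hand-rolled sketch (in a real library
Boost.Proto would capture the tree; all names and the bit-counting rule
here are invented): the whole right-hand side of an assignment is captured
as a lightweight expression, its worst-case size is computed, storage is
allocated once, and only then is the computation done in place.

    #include <algorithm>
    #include <cstddef>
    #include <vector>
    #include <boost/cstdint.hpp>

    struct big_int
    {
        std::vector<boost::uint32_t> digits;

        explicit big_int(std::size_t bit_count = 32)
          : digits((bit_count + 31) / 32) {}

        std::size_t bits() const { return digits.size() * 32; }

        // Building a big_int from an expression: size the storage once
        // from the worst-case estimate, then evaluate in place.
        // (Would be SFINAE-constrained in real code.)
        template <typename Expr>
        big_int(Expr const& e)
          : digits((e.bits() + 31) / 32)
        {
            e.eval_into(*this);
        }
    };

    template <typename L, typename R>
    struct add_expr
    {
        L const& lhs;
        R const& rhs;

        // Worst case: a sum needs at most one bit more than its widest
        // operand, so a whole chain a + b + c is sized in a single pass.
        std::size_t bits() const
        {
            return (std::max)(lhs.bits(), rhs.bits()) + 1;
        }

        void eval_into(big_int& out) const
        {
            // digit-wise addition of lhs and rhs into out.digits,
            // omitted for brevity
        }
    };

    // In real code this operator would be constrained to big_int
    // expressions; left unconstrained here to keep the sketch short.
    template <typename L, typename R>
    add_expr<L, R> operator+(L const& l, R const& r)
    {
        add_expr<L, R> e = { l, r };
        return e;
    }

Written this way, big_int r = a + b + c; performs a single allocation
sized for the whole chain, and the captured expression tree is also the
natural place to recognise patterns amenable to better algorithms.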

* What is your evaluation of the documentation?

The documentation is OK, but it is rather hard to navigate as there are no
real links between sections, forcing me to jump back and forth between
sections.

* What is your evaluation of the potential usefulness of the library?

The problem the library tries to solve is a very important one, as shown
by the large number of similar projects. Such a library is worthy of
inclusion in Boost, but not in its present form and shape.

* Did you try to use the library? With what compiler? Did you have any
problems?

I tried some examples from the doc on gcc 4.5 with no real problems.

* How much effort did you put into your evaluation? A glance? A quick
reading? In-depth study?

* Are you knowledgeable about the problem domain?

I've been doing research and development in scientific computing for 9
years. While the specific problem of large integers is a bit of a fringe
domain for me, I have collaborated on such designs in some projects.

* Do you think the library should be accepted as a Boost library?

No.

In its current form, the library *should not* be accepted into Boost.
Performance claims are not backed up (and other reviewers have shown it
can be far better), the design lacks genericity, and its implementation
is, in my opinion, sub-optimal. To be accepted, the following points
should be addressed:

   - Use expression templates to handle efficient combinations of
operations and potentially detect optimisable patterns. Proto is a
shoo-in for this.

   - The library entities should be made more generic by providing an
extended-integer adaptor facility that makes it possible to substitute a
custom integer representation (like having extended_integer<GMP>) for
corner cases. This adaptor would provide an interface able to turn these
types into a type fulfilling the underlying LargeInteger concept the
library needs to handle (a sketch follows this list). Being compatible
with GMP should even be a goal and a selling point for the library.

  - Do real benchmarks and compare to existing state-of-the-art solutions;
at least show benchmarks against GMP on realistic use cases.

  - The algorithms should be more generic and not be tied to the concrete
type of large integer used.
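
As a sketch of what such an adaptor facility could look like (the
extended_integer and gmp_backend names and their layout are invented here;
only the mpz_* calls are real GMP API), the idea is that the backend is
the only place that knows about the concrete representation:

    #include <gmp.h>

    // Backend wrapping GMP; another backend could wrap the library's own
    // fallback representation or a memory-mapped block of digits.
    struct gmp_backend
    {
        mpz_t value;

        gmp_backend()                     { mpz_init(value); }
        gmp_backend(gmp_backend const& o) { mpz_init_set(value, o.value); }
        ~gmp_backend()                    { mpz_clear(value); }
        // assignment omitted for brevity

        void add(gmp_backend const& a, gmp_backend const& b)
        {
            mpz_add(value, a.value, b.value);  // *this = a + b
        }
    };

    template <typename Backend>
    class extended_integer
    {
        Backend rep_;
    public:
        friend extended_integer operator+(extended_integer const& a,
                                          extended_integer const& b)
        {
            extended_integer r;
            r.rep_.add(a.rep_, b.rep_);  // algorithms never see mpz_t
            return r;
        }
    };

extended_integer<gmp_backend> and an extended_integer over the library's
own digits would then model the same LargeInteger concept and share all
the generic algorithms.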

Frankly, licensing issues aside, a Proto-based library able to wrap GMP if
needed, with a Range-based interface, could already cover 80% of the needs
and provide decent speed. Then building on top of that to add new
algorithms, new back-end representations and adaptors would flesh out the
library in a very smooth way.

