Boost logo

Boost :

Subject: Re: [boost] [xint] Boost.XInt formal review
From: Jeffrey Lee Hellrung, Jr. (jhellrung_at_[hidden])
Date: 2011-03-02 16:46:33


On 3/2/2011 2:58 AM, Vladimir Prus wrote:
> Boosters,
>
> the formal review of the Boost.XInt library, by Chad Nelson, starts now and will
> run through March 11.
>
> The documentation for the current version is available at:
>
> http://www.oakcircle.com/xint_docs/
>
> The source code is available via Subversion at:
>
> http://svn.boost.org/svn/boost/sandbox/xint
[...]

This (like Mathias) is not a review, but rather a random list of
comments upon a first pass through the documentation (and a little
through the implementation, but only because the documentation proved
insufficient to answer some questions).

I have two general and major comments.

1) I was hoping the bignum algorithms would be factored out and allowed
to operate on, e.g., ranges of digits, rather than being coupled to the
xint::integer_t data structure. I know this had been mentioned before.
  Chad, what are your thoughts on this? I know the purpose of this
library is to provide out-of-the-box bignum functionality with minimal
hassle, but, to me, the algorithms should be part of the interface as
well and decoupled from the xint::integer_t data structure.

2) In a similar vein, I would like to see the cryptographic
functionality decoupled from the xint::integer_t data structure as well,
and released separately.

Also, good things: I like the policy-based design, and I think it's the
right direction. The Boost.Parameter arity thing may get onerous (see
comment below), and one thing you could do to work around it is to just
accept a mpl::map of tags, since I'm guessing your not using
Boost.Parameter's deduction functionality (awesome and powerful, but
unnecessary in this case, I think).

On to more specific comments...

Documentation Comments
----------------------

* http://www.oakcircle.com/xint_docs/

"Why would I use it? ... Because it's fast."

Do you have any benchmarks to back this up? Fast relative to...?

"Why would I use it? ... Because it has the features you need."

Included in here is "[c]ryptographically-secure prime number
generation". This seems a little too domain-specific to be included in
this library. Is there any reason the prime number generation couldn't
be factored out and, further, be bignum-implementation agnostic?
Boost.Prime, for example?

* http://www.oakcircle.com/xint_docs/namespaceboost_1_1xint.html

integer_t<...> square (const integer_t<...> n)

The complexity is just O(n^2). If you want to more precise, don't use
O(...), just state the number of additions/subtractions and multiplications.

integer_t<...> pow (const integer_t<...> n, const integer_t<...> e)

I suspect this complexity is, really, O( # of bits in the result ), as
the last one or two multiplies are what dominate.

I was also thinking about adding an overload where the power is
specified as an unsigned long, but I don't know if it would actually be
worth it...

integer_t<...> square_root (const integer_t<...> n)

You should mention that the algorithm is just Newton iteration with
initial guess 2^floor(log2(n)/2) (where n is the input value). Also,
your complexity claim implies a bounded number of Newton iterations,
independent of the inputs, as each loop through the while loop is O(n^2)
(where n is the # of bits) due to the division...that may be the case,
and if so, you might want to back that up with a reference or something,
but I suspect you should have maybe a log(n) in there (where n is the #
of bits), as Newton has "only" quadratic convergence.

I believe there is an O(n) square_root algorithm that would be
asymptotically faster, but...I'd have to look it up...

int is_prime (const integer_t<...> n, callback_t callback=no_callback)

Given the documentation, I think a more descriptive name, such as
is_probably_prime_rabin_miller, would be better. Also, I'm not sure how
small "infinitesimally" small is...it seems the wikipedia page only
guarantees at most a 1/4^5 chance of a false positive (given the current
implementation), but in practice it states the chance is much much less.
  Perhaps you should allow the iteration count (the "k" from the
wikipedia article) to be passed as a parameter as well (possibly
defaulted), for those that want more control on their primality test.

But, similar to a comment above, I'm not sure any cryptographic or
number theoretic features outside of basic bignum arithmetic and
operations should be included in this library; I would like to see them
factored out and independent of xint.

*
http://www.oakcircle.com/xint_docs/structboost_1_1xint_1_1options_1_1threadsafe.html

"Note that even with this option, it is still your responsibility to
ensure that only one thread accesses a particular integer object at a time."

You seem to imply that it is not thread safe under any circumstances for
multiple threads to even have *read* access to an xint::integer_t
object. Is that the case, and if so, why?

* http://www.oakcircle.com/xint_docs/classboost_1_1xint_1_1integer__t.html

template<...> class boost::xint::integer_t<>

"You can specify several template parameters for the type. Most of these
parameters must be specified via helper classes defined in the
boost::xint::options namespace. See the list and descriptions there."

So that covers "most" of the template parameters...what about the rest?
Also, it would be helpful to document here what the default settings are.

integer_t<...> & operator+= (const integer_t<...> b)
integer_t<...> & operator-= (const integer_t<...> b)

Why are these parameters passed by value?

int sign (bool signed_zero=false) const

I think you should separate the 2 different meanings of signedness into
2 separate functions. There are performance considerations, of course,
but it's also difficult to the casual reader to infer what x.sign(true)
is suppose to mean without looking it up. A second function called
sign_with_signed_zero might be more appropriate.

* http://www.oakcircle.com/xint_docs/threadsafe.html

"When using copy_on_write, identical integer objects are allowed to
share storage, which is more efficient for both CPU power and memory..."

I'm (still) not sold that copy-on-write gives any noticeable performance
benefits in this library over move semantics except in contrived
examples, so I think some kind of justification of the above claim would
be justified.

I'm also not a fan of the integer_t::integer_t(const integer_t& b, bool
force_thread_safety) constructor; isn't there already an "ensure_unique"
member function already there (in the implementation) that ensures an
integer_t object has sole ownership of its storage? Indeed, looking
more at some source files, I would guess this is _make_unique. I would
consider such a member function preferable to this constructor overload,
especially given how the mysterious boolean parameter looks in client
code (see integer_t::sign comment above).

* http://www.oakcircle.com/xint_docs/basic__types__and__includes_8hpp.html

"#define BOOST_PARAMETER_MAX_ARITY 6"

This is not guaranteed to actually have any effect. Think about if
Boost.Parameter gets included before any xint header does. Better (and,
yes, more onerous) to document that xint::integer_t uses Boost.Parameter
and hence the user must set this macro to at least 6.

Miscellaneous Comments
----------------------

Why is digit_t in the xint::detail namespace, when it is expected to be
part of the interface for purposes of supplying an allocator?

How did you create your documentation? Your documentation generator may
appreciate an acknowledgment in your documentation.

Also, regarding the complexity notation, it would probably be best to
state the complexity in terms of some other letter than n, so as not to
confuse n with the input value. I know I did *not* do this in my
comments above, as I was trying to follow your notation, but it easily
gets confusing. Or just use log(n) instead of n to mean the # of bits
or digits.

That's it for now,

- Jeff


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk