Boost logo

Boost :

Subject: Re: [boost] [xint] Boost.XInt formal review
From: Jeffrey Lee Hellrung, Jr. (jhellrung_at_[hidden])
Date: 2011-03-06 21:35:47


On 3/6/2011 4:25 PM, Chad Nelson wrote:
> On Sun, 06 Mar 2011 13:44:37 -0800
> "Jeffrey Lee Hellrung, Jr."<jhellrung_at_[hidden]> wrote:
[...]
> XInt's existing interface could be frozen at its current state with no
> problem. Even the removal of the copy-on-write stuff would only result
> in one function having an unused Boolean parameter. Any additional
> functionality can easily be added in a way that doesn't break existing
> client code.

The removal of the COW stuff would result in a huge interface change, as
it implies a particular usage of xint::integer_t objects.

I think whether you're willing to freeze the interface or not should
have little bearing on acceptance, in this case. Given the work that
still should be done (in my very subjective opinion), I don't think XInt
should be accepted as a boost library, and this has the added benefit
that you *don't* have to freeze the interface if and when you find
better alternatives.

>> - is_odd, is_even, sign, hex_digits, is_nan: These are all member
>> functions with no same-named free function...which makes me wonder
>> why these were chosen to be member functions while the likes of
>> getbit, setbit, is_prime, etc. were chosen to be strictly free
>> functions.
>
> Because n1692, the paper presented to the C++ Standards Committee in
> 2004 that I based XInt's interface on, has them there. I saw no point
> in cluttering the interface by adding free functions that just call
> those, although if that would satisfy this objection, it can easily be
> done.

I see. But you understand that their presence as member functions makes
one wonder about the asymmetry, right? I don't see why is_odd is any
more fundamental than getbit; indeed, getbit would seem to be more
fundamental than is_odd or is_even, no?

[...]
>> The Boost.Parameter interface to the xint::integer_t template is
>> fine, although Boost.Parameter *might* be overkill. I was trying to
>> find in the source code where the template parameters fed to the
>> xint::integer_t template are interpreted, but failed.
>
> No surprise, the code is pretty confusing. It's through integer_t's
> base classes, such as nan_functions, in detail/options.hpp.

Found it.

[...]
>> In any case, as far as I know (that is, Chad hasn't given me a
>> specific example), XInt doesn't use the parameter deduction feature of
>> Boost.Parameter,
>
> integer_t_data should clear that up. If I understood your explanation
> of what you meant, I don't think the class could accept an allocator in
> any arbitrary position without it, though I could well be wrong.

I see now. It doesn't look like you're using the keyword and tag
structure outlined here:

http://www.boost.org/doc/libs/1_46_0/libs/parameter/doc/html/index.html#parameter-enabled-class-templates

Could you explain why not? Using Boost.Parameter-style keywords and
tags might obviate the need for most of the parameter deduction
infrastructure you have at line 220 in options.hpp.

>> so one could consider using a Boost.MPL map to pack template
>> parameters together. This gives a slightly clunkier interface, but it
>> avoids the need for the user to set BOOST_PARAMETER_MAX_ARITY, so
>> there's definitely a tradeoff.
>
> One I would by far prefer not to make. Clunky interfaces are what I was
> trying to get away from with this library.

I only mentioned it as an option ;) I don't particularly prefer one or
the other.

>> The random number stuff and the cryptographic-specific stuff should
>> not be included within XInt. These should be left to other libraries
>> (Boost.Random in the former case).
>
> It is. Of the two random-number classes in the code, one is just a
> wrapper around a fast Boost.Random class that provides a default value
> for convenience, the other implements the same abilities as
> Boost.Random's random_device class in a form that's slightly easier to
> work with in the examples (Boost.Random couldn't handle that at all
> under Windows when I originally wrote it). Client code must specify a
> random number generator for the few (two, I think) functions that can
> use one, and they work equally well with Boost.Random's classes.

Are these localized to the examples, then, and hence not included by
default?

> I'm not wedded to the default_random_generator. It's no longer needed
> by any code in the library, and could be deleted without any other
> changes. But the strong_random_generator is significantly more
> convenient for the examples, and somewhat more convenient for casual
> client code use, as it doesn't require a compiled library to use as
> Boost.Random's does.

Perhaps you should suggest adding strong_random_generator to
Boost.Random library, then.

> The random number functions in the library are there to make it easier
> to generate large random numbers, which of course Boost.Random can't
> natively support. That's a legitimate mathematical purpose, one that
> would have to be reimplemented by a significant number of programs
> using the library if it were removed.

Okay, I understand. Do you basically just need to fill a range of
digits with random numbers? Perhaps this is something that Boost.Random
can make easier (although it isn't too hard at present).

>> I think the scope of a bigint library is very clearly to provide
>> arithmetic algorithms and data structures on which the algorithms
>> operate. This does not include cryptographic functionality.
>
> This might be pedantic, but there really *isn't* any cryptographic
> functionality in the library. There are only mathematical functions.
> is_prime (or its new name, is_probably_prime) is most useful in the
> field of cryptography, because huge prime numbers are of particular
> interest in that field, but prime numbers have a lot of uses outside of
> cryptography as well. is_prime is no more a cryptographic function than
> pow or gcd is.
>
> If I had to remove functions that are primarily interesting to
> cryptography, powmod would have to be on the list too, and it's of even
> more general use than is_prime.

Good point. For whatever reason, to me, is_prime just sticks out as a
function that's more appropriate in an example, given its specific use,
but consider others' comments as well.

[...]
>> Part (albeit a small part) of my reason for not accepting XInt is, by
>> Chad's own admission, that he wants the interface to settle down
>> before separating the algorithms out (if I'm misinterpreting your
>> comments, Chad, please correct), suggesting the interface hasn't been
>> finalized yet.
>
> I may have misspoken and said interface at some point, but my intention
> was to let the *internals* settle down before providing access to them.
> The justification is that opening up access to them now would tie my
> hands, because then I couldn't make changes to them without breaking
> client code, which is the last thing I ever want to do.

Okay, I may have misread or misremembered. Apologies.

[...]
>> The fixed-size integers leave something to be desired, but as Chad
>> himself has emphasized, they are second-class citizens. So-called
>> second-class citizens probably should not included within the
>> library, and if they are, not without huge warnings saying something
>> like "EXPERIMENTAL" or "SUBJECT TO CHANGE". [...]
>
> Eh? Only the internals of them are subject to change. The current
> interface can be preserved without any loss of functionality, and added
> onto without breaking anything.

I guess I kind of expected a breaking change if you added a
2's-complement-based implementation, which would seem reasonable. I
haven't looked in detail at the fixed integer stuff anyway, as I
gathered it wasn't a focus of the library. Maybe that inferred to me
that the interface was still in some amount of flux.

Perhaps the fixed integer stuff *should* actually be separated into a
different submodule altogether? I don't know, I seem to remember that's
how you had it originally, but there were arguments in favor of merging
everything into one integer_t class...?

>>> - What is your evaluation of the implementation?
>>
>> Chad already knows my feelings on COW, but to be explicit: I'm not
>> necessarily against use of COW in general, but I have yet to be given
>> a use case where COW would outperform move emulation in theory.
>
> I've located two: abs and negate. I believe those were the functions
> that originally prompted me to use it. I'm sorry that I didn't mention
> them before; I knew I had a reason for going to all the trouble of using
> it, but I didn't recall what that reason was until I saw those while
> working with the code today.
>
> Think about it. Both involve changes to the sign bit only, and one
> only sometimes needs to alter it. It would be foolish to deep-copy a
> ten-thousand-bit or larger number just to flip the sign bit. Even
> worse, not to do anything to it at all, in the case of abs and
> already-positive numbers.

Great example. Abs and negate can just return appropriate views layered
on top of the object at *less* cost than COW. So, yes, it would be
foolish to deep-copy if you're never going to change either the copy or
the original, but deep-copying in the absence of COW is far from necessary.

[...]
>> [...] It is disturbing to me that the COW infrastructure still exists
>> (and even more disturbing that it is still used internally!) when COW
>> is supposedly disabled.
>
> You make it sound like I was trying to hide it. The behavior is clearly
> explained on the "Copying and Argument-Passing" page of the
> documentation. It should probably be on the "threadsafe vs.
> copy_on_write" page too, but it didn't occur to me.

I obviously didn't read this closely enough, hence the surprise.

> The only concrete complaint that people could give me about CoW was
> that code using it couldn't be made thread-safe, which is true. So I
> made a way to instruct the library to force all integer objects to have
> unique storage before they're returned to user code, and made it the
> default to prevent unpleasant client-code surprises. The internal code
> is single-threaded, and completely under my control in any case if I
> find it useful to make it multithreaded someday, so that answered the
> thread-safety complaint.
>
> The only other complaints people had was that they didn't like it,
> which is not a valid reason to change it, and that move-emulation makes
> it irrelevant, which it doesn't in this case.

It's not thread-safe without atomic operations (maybe adds overhead in
single-threaded environments?); it adds space and time overhead
(probably negligible except on small integers); it complicates the
implementation (subjective); as far as I've seen, it is *still*
unnecessary, in the sense that it doesn't give any real performance
benefits; and it encourages a style of programming incompatible with the
standard containers (entirely subjective whether that's good or bad, I
admit).

Further, even if COW has some benefits over a non-COW implementation, it
should probably be built as another layer on top of a non-COW
implementation, as you can't go the other way around. COW shouldn't be
used at the lowest implementation level.

[...]
>> The "fast" claim on the primary documentation page is
>> unsubstantiated, and, according to Chad, somewhat unsubstantiatable
>> (assuming that's a word). I don't think any claims about performance
>> should be made *at all* without some kind of benchmark or statistics
>> to justify it. Any library will be fast on a sufficiently fast
>> (perhaps idealized) computer.
>
> As I've said before, I'm not claiming that it's "faster" or "fastest,"
> which are legitimately questionable claims. I'm saying that it's fast,
> as a way to conversationally lead in to the explanation that
> portability is more important, but reassure the reader that speed is
> still an important design criteria.

Simply put, it is a vacuous claim without further justification.

- Jeff


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk