
Subject: Re: [boost] [xint] Boost.XInt formal review
From: Chad Nelson (chad.thecomfychair_at_[hidden])
Date: 2011-03-04 09:29:29


On Fri, 04 Mar 2011 01:18:53 -0800
"Jeffrey Lee Hellrung, Jr." <jhellrung_at_[hidden]> wrote:

>> It's rather late to make huge changes to the library at this point.
>
> I shouldn't think so. Do you feel this way because the library is
> currently under review?

I feel this way because I'm at the end of both my patience and my
freedom to devote months on end to something that isn't paying work. I
spent five months of twelve- to fifteen-hour days on it last year, first
writing it and then *rewriting* it six more times, trying to satisfy
the people on this list. When I asked for comments on the seventh
iteration, I got dead silence, which I took to mean that I'd
sufficiently satisfied everyone's desires. Obviously not.

If the library is accepted or conditionally accepted, I'll be happy to
continue improving it in my spare time, and probably even redesign it
based on the suggestions I've gotten. If it is outright rejected, it
dies. I'm not completely rewriting it yet again in a vain attempt to
satisfy people who will never be happy with it.

>>> Ummm...yes, the stated complexity is correct, but O(n(n+1)/2) =
>>> O(n^2).
>>>
>>> So asymptotically speaking, it's not any faster than using the naive
>>> grade-school multiplication algorithm. Like I said, unless you want
>>> to give a precise operation count, it's just O(n^2).
>>
>> Um... if you plug a concrete number into both equations, the answers
>> are quite different.
>
> Um... I'm not really sure how to respond to this. It's not a huge
> deal as far as the documentation is concerned, but if you *really*
> think that O(n^2) and O(n*(n+1)/2) are different guys, then I don't
> have a lot of confidence in the rest of your (as-yet-unchecked)
> complexity claims...

I have no formal training on big-O notation, only what knowledge I've
managed to pick up over the years of trying to interpret discussions of
programming algorithms. If roughly half of n^2 is considered equal to
n^2 in big-O notation, then obviously that knowledge isn't sufficient.
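For what it's worth, the point at issue can be checked numerically. This is my own illustration, not code from the library: n*(n+1)/2 and n^2 differ only by a constant factor, and big-O notation discards constant factors, which is why both are written O(n^2).

```python
# Illustration (not library code): the ratio of n*(n+1)/2 to n^2 tends to
# a constant (1/2), so big-O -- which ignores constant factors -- treats
# the two expressions as the same complexity class.
def triangular(n):
    """Exact operation count claimed in the docs: n*(n+1)/2."""
    return n * (n + 1) // 2

for n in (10, 1000, 100000):
    ratio = triangular(n) / (n * n)
    print(n, ratio)  # ratio approaches 0.5 as n grows
```

The exact count n*(n+1)/2 is still useful information; it just belongs alongside, not instead of, the O(n^2) classification.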

I added the complexity estimates because someone on this list said that
the library had to have them. Without volunteering to do any of the
considerable work, I might point out. I tracked down the estimates from
real mathematicians when they were available; when they weren't, I had
to make educated guesses based on what I knew. Some of them will
inevitably be wrong; that's the nature of the beast.

>>> Let p have np bits. Each successive loop iteration *doubles* the
>>> number of bits in p, so the work for each successive iteration is
>>> np^2, (2np)^2, (4np)^2, ..., ((2^ne)*np)^2 (there are ne loop
>>> iterations, where ne is the number of bits in e).
>>
>> As far as I can tell, that's correct. ((2^ne)*np)^2 seems to describe
>> the total amount of work done.
>
> Well, in the context of above, it's only the amount of work done for
> the last iteration, but it's that last iteration that dominates.

Hm, yes, I see that now.
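The "last iteration dominates" claim can be verified numerically. This is my own check, not library code; the values np = 64 and ne = 10 are arbitrary choices:

```python
# My own numeric check of the argument above: the per-iteration costs
# np^2, (2*np)^2, (4*np)^2, ..., ((2^ne)*np)^2 form a geometric series,
# so the total is within a constant factor (4/3) of the last term alone.
def total_work(np_bits, ne):
    terms = [((1 << k) * np_bits) ** 2 for k in range(ne + 1)]
    return sum(terms), terms[-1]

total, last = total_work(64, 10)   # np = 64 bits, ne = 10: arbitrary choices
print(total / last)                # ~4/3, so total = O(cost of last iteration)
```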

>>> http://en.wikipedia.org/wiki/Methods_of_computing_square_roots#Digit-by-digit_calculation
>>
>> I don't have time to dig into that in detail tonight, but it looks
>> very much like a binary search algorithm, very similar to what I'm
>> already using.
>
> In binary, no search needed! See the snippet of code.

I did, that's what reminded me of a binary search.

> I, too, don't have time at the moment to study it in detail, but I
> was at one time very familiar with it. And since it only requires
> bit-shifts and additions (no multiplications), I wouldn't be
> surprised if it were significantly faster than Newton iteration.

I've put it on the to-do list. If it's that much faster, I'll be happy
to implement it.
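For reference, here is a sketch of the binary digit-by-digit method from the cited Wikipedia page, transcribed to Python by me (this is not the library's code). Note that the loop body uses only shifts, additions, subtractions, and comparisons:

```python
# Sketch of the binary digit-by-digit square root (shift-and-add only,
# no multiplications), after the Wikipedia article cited above.
def isqrt(n):
    """Integer square root: the largest r with r*r <= n."""
    if n < 0:
        raise ValueError("isqrt of negative number")
    if n == 0:
        return 0
    bit = 1 << ((n.bit_length() - 1) // 2 * 2)  # highest power of 4 <= n
    root = 0
    while bit:
        if n >= root + bit:
            n -= root + bit
            root = (root >> 1) + bit
        else:
            root >>= 1
        bit >>= 2
    return root
```

Each pass decides one bit of the result directly, which is what distinguishes it from a binary search: there is no guess-and-test narrowing of an interval.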

>>> Again, think about someone *reading* code that uses this is_prime
>>> function.
>>
>> He would expect that it reports whether the input is prime or not,
>> which is exactly the case.
>
> Well, no, it reports "whether the input is prime, with a vanishingly
> small chance that it is mistaken", which is different. To me.
> Sorry, as a mathematician by training, prime and probably prime are
> distinct properties and have very precise meanings. But I'm not a
> cryptographer, so maybe that's just the lingo.

I'm only an amateur cryptographer, but it's generally understood in the
field that proving a number of cryptographic size to be truly prime is
impractical. In saying that a number "is prime," there's always an
unspoken "with a high degree of certainty" attached.

>> With a name like
>> is_probably_prime_rabin_miller, he would probably feel obliged to
>> look up the Rabin-Miller primality test and figure out what
>> "probably" means in this context (at least, I would). I think the
>> shorter name is preferable for the reader, as it accurately conveys
>> the author's intent.
>
> Like I said, my crypto is pretty...well...non-existent... So this
> might be a dumb question, but am I to assume that whenever one wants
> to know that a number is prime, *probably* prime is perfectly okay?

Yes. It has to be, since there's no known way to factor numbers of the
size used in cryptography in anything approaching a reasonable amount of
time. Every SSL or SSH session you use employs such "probably prime"
numbers, as does every other application of public-key encryption.
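For concreteness, here is a minimal sketch of the Rabin-Miller test under discussion. This is my own Python sketch, not XInt's implementation; the small-prime shortcut list and the default round count are arbitrary choices of mine:

```python
import random

# Sketch (not XInt's code) of the Rabin-Miller probabilistic primality
# test: a composite number survives one round with probability <= 1/4,
# so k passing rounds leave an error chance of at most 4**-k.
def is_probably_prime(n, rounds=25):
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p
    # write n - 1 as d * 2^s with d odd
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)       # random witness candidate
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False                     # a proves n composite
    return True                              # prime beyond reasonable doubt
```

A "False" answer is a proof of compositeness; only the "True" answer carries the vanishingly small uncertainty discussed above.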

>>> Besides, COW isn't on by default...
>>
>> It's used internally regardless, as I control everything within the
>> library and can ensure that it's used safely. The thread-safe option
>> only prevents it for integers returned to the user.
>
> Whoa, wait a second here. I can't help but see a red flag here, so
> color me incredulous and possibly very naive. How can copy-on-write
> help you internally where just moving the internal data isn't
> possible?
> Can you give an example of the internals where you find
> copy-on-write necessary?

Only the benchmarks that I came up with before, and referred you to
earlier in this thread. They may well change when I correct the
pass-by-value stuff.

>> I had a publicly-available make_unique function in an earlier
>> iteration of the library, and was criticized for it, so I removed
>> it. I don't remember if it was you personally doing the criticizing,
>> but you-the-Boost-community can't have it both ways. :-)
>
> Well this tagged copy constructor pretty much amounts to the same
> thing, doesn't it? Only less efficient, since it will actually
> perform a deep copy in situations where it's not needed, e.g., when
> the object is already in the thread you want it to be owned in.

Only if the person is foolish enough to explicitly tell the library to
do so. I'd like to think that any programmer using the library wouldn't
be that foolish.

> And more cryptic in client code. By the way, is there ever any
> reason to set the boolean force_thread_safety to false?

It should essentially *always* be false; that's why false is the
default value. Either thread-safety is already on for the type you're
using (the default, by the way), in which case setting it to true is
redundant, or it's off for that type, in which case you must have
explicitly told the library that you're only using objects of its type
in a single thread and will assume responsibility for any changes to
that plan. That parameter exists solely so that, *if* you've told the
library that you're only using it in a single thread, you can override
that for specific copies of integer objects that must cross thread
boundaries.
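To make the pattern concrete, here is a toy model of my own devising, not XInt's internals: a copy-on-write value whose copy operation takes a force_thread_safety flag (the parameter under discussion) to demand an immediate deep copy. All other names here are hypothetical.

```python
# Toy model (not XInt's code) of copy-on-write with an opt-in deep copy.
class CowInt:
    def __init__(self, value=0):
        self._data = [value]                 # the (potentially shared) buffer

    def copy(self, force_thread_safety=False):
        other = CowInt()
        if force_thread_safety:
            other._data = list(self._data)   # deep copy: safe to hand across threads
        else:
            other._data = self._data         # shallow: share until first write
        return other

    def set(self, value):
        self._data = list(self._data)        # detach from sharers (copy-on-write)
        self._data[0] = value

    def get(self):
        return self._data[0]

a = CowInt(5)
b = a.copy()                          # shares storage with a
c = a.copy(force_thread_safety=True)  # gets its own storage immediately
a.set(7)                              # a detaches on write; b and c still hold 5
```

As described above, the flag only matters for the one copy that must cross a thread boundary; every other copy can safely leave it at the default.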

-- 
Chad Nelson
Oak Circle Software, Inc.



Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk