Boost logo

Boost :

Subject: Re: [boost] [xint] Boost.XInt formal review
From: Jeffrey Lee Hellrung, Jr. (jhellrung_at_[hidden])
Date: 2011-03-05 16:41:16


On 3/4/2011 6:29 AM, Chad Nelson wrote:
> On Fri, 04 Mar 2011 01:18:53 -0800
> "Jeffrey Lee Hellrung, Jr."<jhellrung_at_[hidden]> wrote:
>
>>> It's rather late to make huge changes to the library at this point.
>>
>> I shouldn't think so. Do you feel this way because the library is
>> currently under review?
>
> I feel this way because I'm at the end of both my patience and my
> freedom to devote months on end to something that isn't paying work. I
> spent five months of twelve- to fifteen-hour days on it last year, first
> writing it and then *rewriting* it six more times, trying to satisfy
> the people on this list. When I asked for comments on the seventh
> iteration, I got dead silence, which I took to mean that I'd
> sufficiently satisfied everyone's desires. Obviously not.

I can probably speak for most everyone that we certainly appreciate the
time you've put into this to get an actual, concrete bigint proposal
that we can have a serious dialogue about. At the very least, it's
brought out what others hope to see in a bigint library.

Regarding the feedback from previous versions, you should understand
that two major (to me, at least) suggestions from previous versions were:
- Eliminate COW entirely, or, at the very least, don't make it
mandatory. Add it on top of an existing, public infrastructure, if you
really want. And if it is used, I think it would be most appropriate
for an immutable data structure. As far as I can gather from your past
comments, COW is still being used internally. Yes, this is largely an
implementation concern, but I think a valid concern nonetheless.
- Separate the arithmetic algorithms from the data structures, defining
a BigInt concept or group of concepts in the process.
Regarding the former, it appears we still don't have a definitive
benchmark comparing COW with move emulation, and one would be nice, but
we can still predict how many deep copies each strategy will require for
a given expression, and except in constrained situations where copies
are never modified, I don't think COW gives any benefit. Sometimes COW
is worse, since you can't explicitly transfer ownership. Feel free to
prove me wrong. If you can give an example we can analyze where COW
performs fewer copies, it would help your cause. For example:

integer x = ..., integer a = ..., integer b = ...;
x = a + b;

In both cases, there should should be a single allocation to hold the
temporary sum a + b, which subsequently gets transferred to x.

Regarding the lack of feedback on your *seventh* iteration, it's
unfortunate, but I think, throughout your development process, that it
was reasonable to expect:
- that not everyone is going to inspect every iteration of your library; and
- that you will get (by far?) the most of amount of feedback
(including critical, almost negative, feedback) on the version you
submit for inclusion in boost.

> If the library is accepted or conditionally accepted, I'll be happy to
> continue improving it in my spare time, and probably even redesign it
> based on the suggestions I've gotten. If it is outright rejected, it
> dies. I'm not completely rewriting it yet again in a vain attempt to
> satisfy people who will never be happy with it.

Rewriting the library for the primary purpose of satisfying the boost
development list is the wrong perspective to take. If you don't view
the majority of the suggested modifications as positive directions to
take your library in, then maybe boost is the wrong destination for xint.

For what it's worth, I will vote to not accept, but I will try to
summarize my reasons for doing so in a separate email.

>>>> Ummm...yes, the stated complexity is correct, but O(n(n+1)/2) =
>>>> O(n^2).
>>>>
>>>> So asymptotically speaking, it's not any faster than using the naive
>>>> grade-school multiplication algorithm. Like I said, unless you want
>>>> to give a precise operation count, it's just O(n^2).
>>>
>>> Um... if you plug a concrete number into both equations, the answers
>>> are quite different.
>>
>> Um... I'm not really sure how to respond to this. It's not a huge
>> deal as far as the documentation is concerned, but if you *really*
>> think that O(n^2) and O(n*(n+1)/2) are different guys, then I don't
>> have a lot of confidence in the rest of your (as-yet-unchecked)
>> complexity claims...
>
> I have no formal training on big-O notation, only what knowledge I've
> managed to pick up over the years of trying to interpret discussions of
> programming algorithms. If half of roughly n^2 is somehow considered
> equal to n^2 in big-O notation, then obviously that isn't sufficient.

I think a quick trip to wikipedia, as suggested by Robert, should
convince you. Complexity theory is certainly a valuable tool for any
programmer to have, so I'd suggest getting up to speed as soon as
convenient. wikipedia articles should be sufficient.

> I added the complexity estimates because someone on this list said that
> the library had to have them. Without volunteering to do any of the
> considerable work, I might point out. I tracked down the estimates from
> real mathematicians when they were available; when they weren't, I had
> to make educated guesses based on what I knew. Some of them will
> inevitably be wrong, that's the nature of the beast.

Fair enough. Like I said, I suggest familiarizing yourself with the
concepts then.

[...]
>>>> Besides, COW isn't on by default...
>>>
>>> It's used internally regardless, as I control everything within the
>>> library and can ensure that it's used safely. The thread-safe option
>>> only prevents it for integers returned to the user.
>>
>> Whoa, wait a second here. I can't help but see a red flag here, so
>> color me incredulous and possibly very naive. How can copy-on-write
>> help you internally where just moving the internal data isn't
>> possible?
>> Can you give an example of the internals where you find
>> copy-on-write necessary?
>
> Only the benchmarks that I came up with before, and referred you to
> earlier in this thread. They may well change when I correct the
> pass-by-value stuff.

Sorry if this wasn't clear, so let me rephrase. Which implementation
function or block of code necessitates using COW to eliminate
unnecessary copying? This doesn't require benchmarking, only code analysis.

>>> I had a publicly-available make_unique function in an earlier
>>> iteration of the library, and was criticized for it, so I removed
>>> it. I don't remember if it was you personally doing the criticizing,
>>> but you-the-Boost-community can't have it both ways. :-)
>>
>> Well this tagged copy constructor pretty much amounts to the same
>> thing, doesn't it? Only less efficient, since it will actually
>> perform a deep copy in situations where it's not needed, e.g., when
>> the object is already in the thread you want it to be owned in.
>
> Only if the person is foolish enough to explicitly tell the library to
> do so. I'd like to think that any programmer using the library wouldn't
> be that foolish.

You didn't really address my primary comment. You can put lipstick on a
pig, but it's still a pig. You've just renamed make_unique into a
constructor. Is that a fair analysis?

>> And more cryptic in client code. By the way, is there ever any
>> reason to set the boolean force_thread_safety to false?
>
> It should essentially *always* be false, that's why false is the
> default value. Either thread-safety is already on for the type that
> you're using (the default, by the way), in which case setting it to true
> is redundant, or it's off for that type, in which case you must have
> explicitly told the library that you're only using objects of its type
> in a single thread and will assume responsibility for any changes to
> that plan. That parameter exists solely so that, *if* you've told the
> library that you're only using it in a single thread, you can override
> that for specific copies of integer objects that must cross thread
> boundaries.

Sorry, is there any reason to set the boolean force_thread_safety
*explicitly* to false? I suspect the answer is "no", so I would
consider this an inferior design to just have the regular copy
constructor augmented with another constructor that takes a special
empty tag struct whose only purpose to ensure the copy has unique
storage. But ultimately it's a moot point as it's just a hack to
obscure the make_unique member function into something else, and
make_unique would be preferable still. And even *that* is somewhat of a
moot point as I don't think COW should be used for the base integer
class in the first place :/

- Jeff


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk