From: David Abrahams (dave_at_[hidden])
Date: 2005-11-19 20:35:21


"Robert Ramey" <ramey_at_[hidden]> writes:

>>> I suspect that the job of making a portable binary archive is much
>>> harder than it first appears.
>>
>> Actually it's almost trivial (I did it over 10 years ago), but I don't
>> know what that has to do with what we're trying to accomplish.
>
>> The speedups we're proposing don't have anything in particular to do
>> with portable binary archives.
>
> I presumed too much then. From the thread discussion, it seemed
> that this was just the initial effort to adapt the serialization
> library to the needs of High Performance Computing.

That is the immediate application and the initial motivation for the
proposal. However, there is a general class of problems not specific
to what is usually thought of as HPC (tautological definitions of HPC
aside) for which the proposed enhancements can provide dramatic
speedups.

> XDR compatibility (http://www.faqs.org/rfcs/rfc1014.html) was
> mentioned at some point, as was MPI
> (http://www-unix.mcs.anl.gov/mpi/mpi-standard/mpi-report-1.1/node39.htm#Node39)
> (I think). Both of these entail a portable binary format, with
> attendant endian issues.

Those are indeed important applications for the proposed enhancements.
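
To make the endian issue concrete: RFC 1014 (XDR) represents an integer
as 4 bytes, most significant byte first. Below is a minimal sketch of
that byte ordering. The function names encode_u32_be and decode_u32_be
are hypothetical and belong to no library mentioned in this thread.

```cpp
#include <array>
#include <cstdint>

// Encode a 32-bit unsigned integer as 4 bytes, most significant byte
// first (big-endian), independent of the host's native byte order.
// Hypothetical helper for illustration only.
std::array<unsigned char, 4> encode_u32_be(std::uint32_t v) {
    return {{
        static_cast<unsigned char>((v >> 24) & 0xFF),
        static_cast<unsigned char>((v >> 16) & 0xFF),
        static_cast<unsigned char>((v >> 8) & 0xFF),
        static_cast<unsigned char>(v & 0xFF)
    }};
}

// Decode 4 big-endian bytes back into a 32-bit unsigned integer.
std::uint32_t decode_u32_be(const std::array<unsigned char, 4>& b) {
    return (static_cast<std::uint32_t>(b[0]) << 24) |
           (static_cast<std::uint32_t>(b[1]) << 16) |
           (static_cast<std::uint32_t>(b[2]) << 8) |
            static_cast<std::uint32_t>(b[3]);
}
```

Because the shifts operate on values rather than on memory layout, the
same code yields the same bytes on little-endian and big-endian hosts.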

> Maybe the mentioning of this in the context of
> discussion of the submission which didn't really mention this
> confused things in my own mind.

Perhaps.

>>> I didn't pursue this as I really don't want to discourage these
>>> kinds of efforts and they are (or should be) orthogonal to the
>>> library as it is currently implemented. If they can be implemented
>>> without altering the core - then I have no problem. If someone
>>> believes that modifying the core is unavoidable, then either he or I
>>> have made some sort of mistake and it will have to be resolved.
>
>> It's not unavoidable; as I've said before, it just has consequences
>> that we don't like, and we think you probably won't like either. If
>> you can hang on until we've presented what we think is the best design
>> that avoids altering the core, then we can look at the consequences.
>> Once you understand them, if you still don't want to make any changes
>> and you're willing to accept the consequences, we're not going to
>> press the issue any further.
>
> Fine, I was asked to comment on what was submitted. We'll start the
> next round with a clean slate.

Great.

>>> If they don't really have to alter the core, but the archive author
>>> thinks it would make his job easier - then we have a problem.
>>
>> Let me be very clear about this, at least:
>>
>> ,----
>> | Ease of archive implementation is unrelated to the motivation for
>> | requesting core changes.
>> `----
>>
>> I hope that allays at least one of your concerns.
>
> It does. And I'm sure you probably deal with this on a regular basis
> with your own libraries.

Actually, I can't remember a time when the convenience of my users or
extenders came into conflict with the conceptual integrity or
maintainability of my library. Maybe it's just a mindset thing, or
maybe it's something about the problem space you're addressing.
Regardless, I understand and sympathize with your position.

>>> I get a suggestion about once a month to modify the core of the
>>> library for this or that reason. Aside from bugs, it usually boils
>>> down to the suggester looking at the code and seeing - "Oh I could
>>> fix this right there!" without considering all the repercussions and
>>> without considering the alternatives. (As you might guess, this is
>>> what I believe happened in this case).
>
> Note that this isn't a personal criticism - it's a natural occurrence that
> happens all the time.

It's not insulting; it just happens to be mistaken. Matthias tried
hard to avoid modifying the library, but these consequences I keep
referring to can't be avoided any other way. Regardless, I hope you'll be
able to suspend judgement until we get to that part of the discussion.

>> Actually Matthias' considerations went much deeper than you give him
>> credit for. In my opinion, he just failed to communicate his
>> rationale properly, and since the details of his code seemed to you to
>> violate basic principles of your design, I'm sure it was all the more
>> difficult for you to understand the problems he is trying to avoid.
>
> LOL - I think I understood the code submitted and what it was
> intended to achieve.

Of course you do. However, there's no sign that you understand the
reasons for core changes. When Matthias asked the question that was
aimed at highlighting those reasons, you stopped replying, and even
though you've restarted the thread, you still haven't answered it.
That said, at this point I think you should hold off until after I've
methodically built the context for the question.

> As far as I could fathom the rationale, I presented an alternative
> designed to achieve the same results without sprinkling bits of code
> throughout lots of other modules.

Yes, as I've been saying, without core changes, fast array
serialization is achievable, but not without significant costs that I
don't think you've considered. At least, I haven't seen any evidence
that you have.
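
As a rough illustration of what "fast array serialization" means in
general, the sketch below contrasts saving a contiguous array one
element at a time with saving it as a single block. This is not the
design proposed in this thread and not Boost.Serialization's API. The
class byte_archive and the functions save_elementwise and save_block are
hypothetical, and the sketch assumes trivially copyable element types.

```cpp
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical in-memory binary archive; not a Boost.Serialization class.
class byte_archive {
public:
    // Generic path: one call per object.
    template <class T>
    void save(const T& x) {
        static_assert(std::is_trivially_copyable<T>::value,
                      "sketch handles trivially copyable types only");
        append(&x, sizeof(T));
        ++calls_;
    }

    // Fast path: one call copies n contiguous objects as a block.
    template <class T>
    void save_array(const T* p, std::size_t n) {
        static_assert(std::is_trivially_copyable<T>::value,
                      "sketch handles trivially copyable types only");
        append(p, n * sizeof(T));
        ++calls_;
    }

    const std::vector<unsigned char>& bytes() const { return bytes_; }
    std::size_t calls() const { return calls_; }

private:
    void append(const void* p, std::size_t n) {
        const unsigned char* b = static_cast<const unsigned char*>(p);
        bytes_.insert(bytes_.end(), b, b + n);
    }

    std::vector<unsigned char> bytes_;
    std::size_t calls_ = 0;
};

// Element-by-element: v.size() archive calls.
template <class T>
void save_elementwise(byte_archive& ar, const std::vector<T>& v) {
    for (const T& x : v) ar.save(x);
}

// Block: a single archive call for the whole array.
template <class T>
void save_block(byte_archive& ar, const std::vector<T>& v) {
    ar.save_array(v.data(), v.size());
}
```

Both functions produce the same bytes; the block version simply reaches
them with one archive call instead of one per element, which is where a
speedup for large arrays would come from.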

>> Working from new code that (I hope!) won't cause you any alarm, it
>> might be easier to understand the rationale.
>
> I guess you and Matthias were somewhat taken aback by my
> response. Sorry about that. Anyway, it seems you do have
> an understanding and even appreciation of my concerns so
> I'm optimistic that the next iteration will be better.

I'm pretty sure you'll be more comfortable. It's hard to imagine what
you could object to in code that only builds upon Boost.Serialization.
If we're not going to modify the existing library we could even keep
out of your directories and namespaces.

> The crux of my argument is that I believe that the kinds of
> extensions you want to implement can best be done without altering
> the current library.

Yes, that's very clear.

> I'm willing to be proved wrong with a counter example - but the last
> didn't qualify in my opinion.

There's no proof, and never will be. When we've come to an
understanding about consequences of not altering the core, you'll
either decide they are bad enough to warrant an alteration, or you
won't. It's a judgement call.

>> It might be a good idea for you to clearly define the intended
>> scope of the library... Depending on your answer, we might
>> indeed be barking up the wrong tree.
>
> The very first sentence of the Overview of the Documentation states:
>
> "Here, we use the term "serialization" to mean the reversible deconstruction
> of an arbitrary set of C++ data structures to a sequence of bytes. Such a
> system can be used to reconstitute an equivalent structure in another
> program context. Depending on the context, this might be used to implement object
> persistence, remote parameter passing or other facility. In this system we
> use the term "archive" to refer to a specific rendering of this stream of
> bytes. This could be a file of binary data, text data, XML, or some other
> created by the user of this library. "
>
> I'm not sure I can make a better statement than that regarding what
> I expected the library to be used for.

You're right, that's pretty specific, and it shows me that our
application is well within the bounds of your intention.

>>> So, I look forward to seeing progress on the following:
>>>
>>> a) better handling of special optimization opportunities
>>> which obtain for certain combinations of data-types and archives.
>>> Hopefully, an elegant implementation will serve as a model
>>> for other people's pet additions.
>
>> I hope we'll be able to show you something elegant very soon.
>
> No need to hurry on my account.

The "soonness" is for our benefit, not yours. :)

>>> b) A portable binary implementation suitable for
>>> such things as MPI messages.
>
>> Portable binary archives and MPI have little relationship to one
>> another. You don't flatten your data into a portable format, ship
>> it in an MPI message that is just a sequence of bytes, and then
>> deserialize. MPI handles portability internally.
>
> I've taken only the most cursory look at MPI. (turns out this may
> change due to some other project). So I won't dispute this. I
> don't see how one could pass information between heterogeneous
> machines without addressing all the issues related to making a
> portable binary archive. Perhaps MPI leaves that part undefined -

No, I'm telling you the opposite. MPI addresses those issues
directly.

> but still it will have to be dealt with somewhere.

Right, MPI deals with it.

>>> I also expect these to take some time and hope they
>>> can be subjected to the boost "process" of public
>>> criticism and refinement. This will take more time
>>> but result in a better product. Hopefully, it will
>>> be less stressful as well - though I doubt it.
>>>
>>> I really am trying to wind down my involvement in the
>>> serialization library.
>>
>> That's a bit alarming, actually. Have you got someone else lined
>> up to maintain it?
>
> I was thinking of Matthias though I've never brought it up

I seriously doubt that would be possible. Matthias has far too many
jobs already.

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com
