From: David Abrahams (dave_at_[hidden])
Date: 2002-12-12 19:50:31


brangdon_at_[hidden] (Dave Harris) writes:

>> > We could send a binary format through a "uuencode" filter, but a
>> > text format which was natively safe would be neater (and probably
>> > more efficient).
>>
>> Why would it be more efficient?
>
> Because it has more knowledge.
>
> For example, if we write out the number 500 using an alphabet of 64 safe
> characters, it takes 2 characters. If we write it out using all 256
> characters, it still takes 2 of them, but now to make it safe each
> character needs 2 safe characters to represent it, so it takes 4 bytes
> altogether. The double conversion is more verbose because the first part
> loses information.

I follow everything you're saying, but can't see how it supports your
conclusion. This sounds exactly like an argument for uuencoding binary
instead of uuencoding text.

"500" as text is 3 bytes; uuencoded it is 6 bytes. 500 in a
variable-length binary format is 2 bytes; uuencoded it is 4 bytes.
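
The byte counts above can be checked with a small sketch. It assumes the simplified model used in this thread, where every raw byte costs two safe characters (real uuencode emits roughly 4 characters per 3 bytes, plus line overhead). `digits_in_base` and `escaped_size` are hypothetical helper names, not part of any library.

```cpp
#include <cstddef>
#include <string>

// Number of digits needed to write n in the given base (n == 0 needs one digit).
std::size_t digits_in_base(unsigned long n, unsigned long base) {
    std::size_t digits = 1;
    while (n >= base) {
        n /= base;
        ++digits;
    }
    return digits;
}

// Simplified model from the thread (an assumption, not real uuencode):
// every raw byte is represented by two safe characters.
std::size_t escaped_size(std::size_t raw_bytes) {
    return raw_bytes * 2;
}
```

Under that model, 500 needs 2 digits in either base 64 or base 256, the two binary bytes escape to 4 characters, and the 3-character text "500" escapes to 6.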

>> "schema ID"?
>
> A term from MFC. It is what the submitted library calls a file_version.
>
>
>> Can you give an example of "containing the mess within the UDT?"
>
> I don't have a good example to hand. Here's a made up one:

<snip, with explanation>

> I don't know what you think of this code - whether it horrifies you
> for being too low level or lacking in design foresight.

Not really; it's very familiar looking. I used to write desktop
apps. ;->

> It is my practical experience. Designs age, and the history accretes
> in the serialisation load routines. I hope that the boost library
> will be able to support this kind of evolution.

Agreed.
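
A minimal sketch of how that history tends to accrete in a load routine. The `Shape` class, its three versions, and the whitespace-separated text format are all made up for illustration; none of it comes from the submitted library, and `file_version` is used only in the sense discussed above.

```cpp
#include <istream>
#include <sstream>
#include <stdexcept>

// Hypothetical user-defined type whose stored layout changed over three releases.
struct Shape {
    int width = 0;
    int height = 0;
    int border = 1;  // added in version 2; older files keep this default

    void load(std::istream& in, unsigned file_version) {
        if (file_version > 3)
            throw std::runtime_error("archive is newer than this program");
        if (file_version < 3) {
            // Versions 1 and 2 stored a single size for a square shape.
            int size = 0;
            in >> size;
            width = height = size;
        } else {
            in >> width >> height;
        }
        if (file_version >= 2)
            in >> border;
    }
};
```

The version checks stay inside the type itself, so code that uses `Shape` never sees the older layouts.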

> I don't claim that code like this is the best solution, but in
> practice I have found it works.

I'd like to see a better one, if it's out there.

>> It's beginning to sound more and more like the metaclass framework
>> some people have been hinting at.
>
> Do you mean that some framework could handle a history like that
> reflected in the above code, automatically?

No, actually I wasn't thinking of revision history when I wrote that.
Don't have the original message anymore, though...

> Java manages it by storing a snapshot of the class hierarchy (as it was
> when the archive was made) into the archive. That gives it enough
> information to figure out how the hierarchy has changed. However, it can
> lead to rather bloated archives.

I bet!

>> ... assuming there is such a factory method.
>
> The archive has to store something to represent classes, and has to be
> able to create instances of the classes so represented, in order to
> restore polymorphic pointers. That's what I mean by a "factory method". I
> don't mean to imply a particular implementation.

OK.
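
One possible shape for such a "factory method", sketched under the assumption that the archive stores a string identifier per class. `Registry`, `Base`, and `Circle` are hypothetical names, and this is only one of many ways to implement the idea.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

struct Base {
    virtual ~Base() {}
    virtual std::string kind() const = 0;
};

struct Circle : Base {
    std::string kind() const override { return "Circle"; }
};

// Hypothetical registry: maps the identifier stored in the archive
// to a function that creates an instance of the matching class.
class Registry {
public:
    using Creator = std::function<std::unique_ptr<Base>()>;

    void add(const std::string& id, Creator make) {
        creators_[id] = std::move(make);
    }

    std::unique_ptr<Base> create(const std::string& id) const {
        auto it = creators_.find(id);
        if (it == creators_.end())
            throw std::runtime_error("unknown class id in archive: " + id);
        return it->second();
    }

private:
    std::map<std::string, Creator> creators_;
};
```

When restoring a polymorphic pointer, the loader would read the identifier from the archive, call `create`, and then let the new object load its own state.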

>> It sounds like your viewpoint on this is very heavily influenced by
>> one particular kind of application.
>
> Yes. Well, less so than my choice of words may have implied. And of
> course in that passage I was discussing a trap that MFC fell
> into. Your earlier comment:
>
> [...] the use of type_info::name() for type identification. Even
> if these were optional components to the library, they could
> provide enormous benefit for some applications.
>
> made it sound like you might make the same mistake. If we use class names
> to identify types, we need to make sure we can rename classes and still
> load old files.

I don't know if I agree. The kinds of applications that could use
type_info::name(), in my experience, might be using the archive as a
cache and just throw it out when this sort of thing happens. Anyway,
I'm not proposing it as a general solution, just as something I'd like
to be sure wasn't prevented.
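
A rough sketch of the cache scenario, assuming the archive begins with a tag taken from `type_info::name()`. `Settings`, `cache_tag`, and `cache_is_usable` are hypothetical names. The string returned by `name()` is implementation-defined, so the tag is only meaningful to the same compiler and build.

```cpp
#include <string>
#include <typeinfo>

struct Settings { int volume = 0; };  // hypothetical cached type

// Tag written at the head of a cache archive. The exact string is
// implementation-defined and may change if the class is renamed.
template <class T>
std::string cache_tag() {
    return typeid(T).name();
}

// A cache whose stored tag no longer matches is discarded and rebuilt,
// rather than migrated.
template <class T>
bool cache_is_usable(const std::string& stored_tag) {
    return stored_tag == cache_tag<T>();
}
```

This suits data that can be regenerated at will. It does nothing for files that must survive a class rename.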

> But generally, yes, I know what kind of applications I write and I
> hope boost will support. If other people have different
> expectations, shouldn't they write about them? Isn't that what this
> pre-coding discussion is for?

Yep.

> I'm sorry for the length of this post, but now that I've written it,
> maybe you can tell me whether I want a persistency library or a
> serialisation library.

To me, it sounds like you're after persistence, with data structure
evolution between program runs. Serializing/deserializing the data
will probably be an important part of what it takes to get there.

-- 
                       David Abrahams
   dave_at_[hidden] * http://www.boost-consulting.com
Boost support, enhancements, training, and commercial distribution
