
Boost :

From: krempp_at_[hidden]
Date: 2000-11-09 20:11:46


--- In boost_at_[hidden], Beman Dawes <beman_at_e...> wrote:
> Samuel,
>
> Have you looked at http://www.egroups.com/files/boost/persistence/ by Jens
> Maurer?

argh..
I had not noticed persistence in my exploration of the Files sections
in search of things that would interest me..
Now I just had a look at it.
It seems to aim at the same goal as what I had in mind with pickle,
with some differences..

>Several of us have been hoping that Jens will restart his development
>effort when he gets back from vacation in a few days.

humm.. persistence files are dated June 2000. Impressive vacations..
What kind of job does he have ? maybe it's not too late for me to take
the right career directions... ;-)

I also discovered the few (4) messages in the list about
persistence, including the one with this link:
http://x42.deja.com/getdoc.xp?AN=577017164&CONTEXT=952360117.1170735127&hitnum=0
which also brings good ideas.
Did anybody get more details on its approach? I like the fact that
you just describe what the members are, instead of implementing write + load..
(it only doubles the code, but..)
And also GetName() being specifiable instead of always relying on RTTI
names , is similar to what I considered.
[ Dumping the mangled typeid(T).name() is unreadable...]

BTW, why print the type name only when writing to a file? It is always
very valuable information about the written object.
shift_writer should do it by default..

Here are the first criticisms I could think of:

1_ Persistence lacks nesting of data within convenient block markers.
 Saving a vector<vector<float> > creates output that is unreadable by
humans without nesting.
Maybe human readability is not a concern for save_file.
But I take it as a primary goal..
OTOH, implementing a full XML-like nesting system might be too verbose
in some situations.
But we may find a way to specify the desired level of decoration.
Maybe through a template parameter class 'Decorater' ? It would hold
methods for the different occasions when printing additional
'human-helpers' characters makes sense
e.g.:
- iteration in a container:
  the nth iteration could print '#n'
- begin/end of a container:
  could print { and },
  or XML-like "</"+description_name()+">",
  maybe with some "\n" from time to time..

  And then it would also need to indicate how to skip those additional
characters, when reading, by methods called at the same occasions.
Maybe it sounds unclear, but in my mind it is quite precise..
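
To make the idea a bit more concrete, here is a minimal sketch of such a decoration policy. All the names (brace_decorator, no_decorator, writer, reader, dump, round_trip) are hypothetical and are not part of persistence or pickle; the only point is that every hook that prints decoration when writing has a matching hook that skips it when reading.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical policy: each hook called while writing has a matching hook
// that skips the same characters while reading.
struct brace_decorator {
    static void begin_container(std::ostream& os) { os << "{ "; }
    static void end_container(std::ostream& os)   { os << "} "; }
    static void skip_begin(std::istream& is) { expect(is, "{"); }
    static void skip_end(std::istream& is)   { expect(is, "}"); }
private:
    static void expect(std::istream& is, const char* token) {
        std::string word;
        if (!(is >> word) || word != token) is.setstate(std::ios_base::failbit);
    }
};

// Same interface, no decoration at all (the terse end of the scale).
struct no_decorator {
    static void begin_container(std::ostream&) {}
    static void end_container(std::ostream&) {}
    static void skip_begin(std::istream&) {}
    static void skip_end(std::istream&) {}
};

template <class Decorator>
struct writer {
    std::ostream& os;
    template <class T> void save(const T& value) { os << value << ' '; }
    template <class T> void save(const std::vector<T>& v) {
        Decorator::begin_container(os);
        os << v.size() << ' ';               // element count, then the elements
        for (const T& item : v) save(item);  // recurses for nested containers
        Decorator::end_container(os);
    }
};

template <class Decorator>
struct reader {
    std::istream& is;
    template <class T> void load(T& value) { is >> value; }
    template <class T> void load(std::vector<T>& v) {
        Decorator::skip_begin(is);
        std::size_t n = 0;
        is >> n;
        v.clear();
        if (!is) return;                     // malformed input: leave v empty
        v.resize(n);
        for (T& item : v) load(item);
        Decorator::skip_end(is);
    }
};

typedef std::vector<std::vector<float> > matrix;

template <class Decorator>
std::string dump(const matrix& m) {
    std::ostringstream out;
    writer<Decorator> w{out};
    w.save(m);
    return out.str();
}

template <class Decorator>
matrix round_trip(const matrix& m) {
    std::istringstream in(dump<Decorator>(m));
    reader<Decorator> r{in};
    matrix result;
    r.load(result);
    return result;
}

// Usage: both decoration levels must read back what they wrote.
void decorator_demo() {
    matrix m;
    m.push_back(std::vector<float>{1.0f, 2.0f});
    m.push_back(std::vector<float>{3.0f});
    assert(round_trip<brace_decorator>(m) == m);
    assert(round_trip<no_decorator>(m) == m);
}
```

The level of decoration is chosen at compile time, so a terse dump costs nothing extra. An XML-like decorator would presumably need the description name passed to its hooks as well, which this sketch leaves out.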

2_
prevention of locale problems boils down to
      if(std::isprint(*it))
so, ok, the shift_writer won't send characters that are unprintable
according to the current locale (which can differ from the stream's
locale, BTW..) to the stream.

But it is far too strict.
Since you only deal with normal chars, there must be a better way to
output chars like "Arthur est bête" without stupidly translating ê
into octal code.
If the writer writes to a file, all chars can be saved without
problems.
Getting rid of this if(isprint(*it)) is the best thing to do.
(let the program used to view the dumped file decide what is
printable, it will know much more than C++'s current implementations)

in fact, replacing "John Doe" by "bête"
wonderfully results in core dump.. (g++ version 2.97 20001103,
stdc++-v3)
data.txt shows that ê gets mistakenly translated into:
"\37777777752" (wonder how this can be the octal code for the
value of ê..)
So the database["bête"].cars(..) throws.
Replacing isprint by true works beautifully, of course, and gives
a much prettier data.txt.
In most cases I think I would prefer writer to send each char as is,
[when not overlapping on parsing symbols]
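
A plausible explanation for the strange octal value, assuming char is signed on this platform, int is 32 bits, and ê is stored as the Latin-1 byte 0xEA: the char holds -22, and widening it before printing sign-extends it to 0xFFFFFFEA, which is 37777777752 in octal. Going through unsigned char first gives the expected 352. The sketch below only illustrates that arithmetic; the function names are hypothetical and this is not the actual shift_writer code.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Widen the signed char directly: a negative value is sign-extended
// before it is converted to unsigned int.
std::string escape_sign_extended(signed char c) {
    char buf[16];
    std::snprintf(buf, sizeof buf, "\\%o", static_cast<unsigned int>(c));
    return buf;
}

// Go through unsigned char first, so only the 8 bits of the byte are kept.
std::string escape_byte(signed char c) {
    char buf[16];
    std::snprintf(buf, sizeof buf, "\\%o",
                  static_cast<unsigned int>(static_cast<unsigned char>(c)));
    return buf;
}

// Usage: -22 is the byte 0xEA seen as a signed 8-bit value.
void escape_demo() {
    const signed char e_circumflex = -22;
    assert(escape_sign_extended(e_circumflex) == "\\37777777752");
    assert(escape_byte(e_circumflex) == "\\352");
}
```

The same cast matters for the test itself: the <cctype> version of isprint has undefined behavior when given a negative value other than EOF, so a plain char should be converted to unsigned char before the call.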

If the writer writes to terminal, or even to printer, then..
we might need to translate unprintables.. but I hope we can find a
way to do it *only* in this case.

Locale could also apply to arithmetic types.
If you dump x=123*1000+0.01; within a locale and you get
"123.000,01"
you won't be able to read it with a "C" locale.
So maybe either always set the "C" locale, or include the locale name in
the dump.
Even if this locale name might not be supported in another
environment, I would prefer the second way, and be able to read
numbers formatted the way I want, if I wish so..
(users should then use the "C" locale if they want the dumped file to be
loadable everywhere. But they still have the choice to use "fr_FR" and
get the numbers, dates, and everything, according to their tastes.)
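
The numeric problem can be reproduced without relying on which locales are installed. In this sketch a hypothetical numpunct facet (dot_grouping_punct) stands in for a locale that writes "123.000,01"; dump_number and load_number are illustrative helpers, not part of any existing library.

```cpp
#include <cassert>
#include <iomanip>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical stand-in for a locale that uses ',' as decimal point and
// '.' as thousands separator, so no system locale name is needed.
struct dot_grouping_punct : std::numpunct<char> {
protected:
    char do_decimal_point() const override { return ','; }
    char do_thousands_sep() const override { return '.'; }
    std::string do_grouping() const override { return "\3"; }
};

std::locale grouped_locale() {
    // The locale takes ownership of the facet.
    return std::locale(std::locale::classic(), new dot_grouping_punct);
}

// Write x with two decimals using the given locale.
std::string dump_number(double x, const std::locale& loc) {
    std::ostringstream os;
    os.imbue(loc);
    os << std::fixed << std::setprecision(2) << x;
    return os.str();
}

// Read one double using the given locale.
double load_number(const std::string& text, const std::locale& loc) {
    std::istringstream is(text);
    is.imbue(loc);
    double x = 0;
    is >> x;
    return x;
}

// Usage: the "C" locale round-trips, the mixed case silently loses data.
void locale_demo() {
    const double x = 123 * 1000 + 0.01;
    const std::locale c = std::locale::classic();
    assert(dump_number(x, grouped_locale()) == "123.000,01");
    assert(load_number("123.000,01", c) == 123.0);  // parsing stops at ','
    assert(load_number(dump_number(x, c), c) == x);
}
```

Imbuing std::locale::classic() on the stream corresponds to the first option (always "C"). The second option would store a locale name in the dump and construct std::locale from it when loading, which throws if that name is not supported.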

3_
binary could be made a bit portable...

4_
I prefer loading/saving via calling operator << and >> on an object.
I find that
boost::ipickle<boost::shift_reader> pic(std::ifstream("data.txt"));
pic>>database;

makes nicer code (without even needing a wrapper) than
  boost::load_file<boost::shift_reader>(database, "data.txt");
even more when you use pic several times consecutively...

[ supposing shift_reader is modified to always print some
implementation of name_of_type<T>() as a first word . ]
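
For illustration, a wrapper of this kind could look roughly like the sketch below. It is only a guess at the shape of the interface: text_reader is a hypothetical reader policy standing in for shift_reader, and this ipickle takes a named stream by reference rather than a temporary ifstream.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical reader policy: whitespace-separated text via operator>>.
struct text_reader {
    template <class T>
    static void read(std::istream& is, T& value) { is >> value; }
};

// Hypothetical wrapper: forwards each operator>> call to the reader policy.
template <class Reader>
class ipickle {
public:
    explicit ipickle(std::istream& is) : is_(is) {}

    template <class T>
    ipickle& operator>>(T& value) {
        Reader::read(is_, value);
        return *this;  // allows chaining several loads
    }

    // True while no load has failed.
    explicit operator bool() const { return !is_.fail(); }

private:
    std::istream& is_;
};

// Usage: several objects loaded consecutively from the same pickle.
bool pickle_demo() {
    std::istringstream in("42 3.5 hello");
    ipickle<text_reader> pic(in);
    int i = 0;
    double d = 0;
    std::string s;
    pic >> i >> d >> s;
    return pic && i == 42 && d == 3.5 && s == "hello";
}
```

Returning *this from operator>> is what makes consecutive loads read naturally, which is the case where the free-function form gets repetitive.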

-- 
Sam

Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk