Boost Users :

Subject: Re: [Boost-users] boost serialisation file size comparison
From: Francois Mauger (mauger_at_[hidden])
Date: 2009-12-08 07:55:55


Hello Avi,

Let's consider the following case: we want to serialize these six
'small' integers: 0 1 2 3 4 5

In an ASCII text archive, you get the following readable string:
  "0 1 2 3 4 5"
  This is 11 chars (6 digits + 5 spaces) == 11 bytes.

In binary, if you use 32-bit integers (int on a typical processor),
you typically get 6 ints x 4 bytes == 24 bytes (no whitespace
separator is needed in this case).
This is about twice the ASCII storage.
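
The arithmetic above can be checked with a small sketch. It uses only
the standard library and does not go through Boost.Serialization: real
archives also write header and bookkeeping data, so actual file sizes
will differ. The helper names text_size and binary_size are made up
for this illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Bytes needed when the values are written as space-separated decimal text.
std::size_t text_size(const std::vector<std::int32_t>& values) {
    std::ostringstream out;
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (i != 0) out << ' ';  // separator between values, none at the end
        out << values[i];
    }
    return out.str().size();
}

// Bytes needed when the values are written as raw 32-bit integers.
std::size_t binary_size(const std::vector<std::int32_t>& values) {
    std::ostringstream out;
    for (std::int32_t v : values) {
        // Copy the in-memory representation; no separator is required.
        out.write(reinterpret_cast<const char*>(&v), sizeof v);
    }
    return out.str().size();
}
```

For the values 0 1 2 3 4 5, text_size gives 11 and binary_size gives
24. With larger values such as 1000000 2000000 3000000 the balance
flips: 23 bytes as text against 12 bytes in binary.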

This example shows a special case in which the situation you mention
can occur. It depends on your data: a small value takes one or two
characters as text but always a full 4 bytes as a 32-bit int. In
general, though, it is more efficient to store in binary.

If you compress both archives (gzip or whatever), I expect the
sizes of the files to be comparable, because the basic information
content is the same.

A good approach to save storage space would be to use the int16_t or
even int8_t types, if their range is enough to suit your needs.
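
As a rough sketch of that idea, the helpers below check whether a set
of values fits in a narrower signed type and how many raw bytes that
type would need. The names all_fit and raw_size are hypothetical, and
the byte counts cover the payload only, not anything a real archive
adds around it.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// True if every value can be stored in the signed integer type T without loss.
template <typename T>
bool all_fit(const std::vector<std::int32_t>& values) {
    for (std::int32_t v : values) {
        if (v < std::numeric_limits<T>::min() ||
            v > std::numeric_limits<T>::max()) {
            return false;
        }
    }
    return true;
}

// Payload bytes needed to store the values as raw integers of type T.
template <typename T>
std::size_t raw_size(const std::vector<std::int32_t>& values) {
    return values.size() * sizeof(T);
}
```

For 0 1 2 3 4 5, all_fit<std::int8_t> is true and raw_size gives 6
bytes for int8_t, 12 for int16_t and 24 for int32_t.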

hope it helps

regards

frc

--
 >>> Avi Bahra wrote:
> I have written some tests for the different kinds of serialization
> archives in my application.
> In the tests below the only difference is the kind of archive used
> (and the use of ios::binary for the binary streams).
> 
> ANode:: ...test_node_tree_persistence_text             file_size: 4014
> ANode:: ...test_node_tree_persistence_binary           file_size: 5351
> ANode:: ...test_node_tree_persistence_portable_binary  file_size: 2878
> 
> What I don't understand is why the binary archive's file size
> should be larger than the text archive's?
> 
> -- 
>   Best regards,
> Ta,
>    Avi
-- 
François Mauger
Département de Physique - Université de Caen Basse-Normandie
courriel/e-mail: mauger_at_[hidden]
tél./phone: 02 31 45 25 12 / (+33) 2 31 45 25 12
fax:        02 31 45 25 49 / (+33) 2 31 45 25 49
Adresse/address:
   Laboratoire de Physique Corpusculaire de Caen (UMR 6534)
   ENSICAEN
   6, Boulevard du Marechal Juin
   14050 CAEN Cedex
   FRANCE