The tutorial_pba_0.cpp sample program uses a boost::archive::portable_binary_oarchive object attached to a standard output file stream to store several variables of primitive types (bool, char, integer and floating point numbers) as well as a std::string.
/** tutorial_pba_0.cpp
 *
 * (C) Copyright 2011 François Mauger, Christian Pfligersdorffer
 *
 * Use, modification and distribution is subject to the Boost Software
 * License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at
 * http://www.boost.org/LICENSE_1_0.txt)
 *
 */

/**
 * The intent of this program is to serve as a tutorial for
 * users of the portable binary archive in the framework of
 * the Boost/Serialization library.
 *
 * This quick start example shows how to store some variables
 * of basic types (bool, integer, floating point numbers, STL string)
 * using the portable binary archive format associated to a
 * standard output file stream.
 *
 */

#include <string>
#include <fstream>

#include <boost/cstdint.hpp>
#include <boost/archive/portable_binary_oarchive.hpp>

int main (void)
{
  // The name for the example data file :
  std::string filename = "pba_0.data";

  // Some variables of various primitive types :
  bool        b              = true;
  char        c              = 'B';
  uint32_t    answer         = 42;
  float       computing_time = 7.5e6;
  double      e              = 2.71828182845905;
  std::string slogan         = "DON'T PANIC";

  // Open an output file stream in binary mode :
  std::ofstream fout (filename.c_str (), std::ios_base::binary);

  {
    // Create an output portable binary archive attached to the output file :
    boost::archive::portable_binary_oarchive opba (fout);

    // Store (serializing) variables :
    opba & b & c & answer & computing_time & e & slogan;
  }

  return 0;
}

// end of tutorial_pba_0.cpp
The compiled executable creates the pba_0.data file which contains the following bytes:
127 1 9 1 84 1 66 1 42 4 192 225 228 74 8 116 87 20 139 10 191 5 64 1 11 68 79 78 39 84 32 80 65 78 73 67
This format is explained in detail below.
Note:
The tutorial_pba_1.cpp sample program uses a boost::archive::portable_binary_iarchive object attached to a standard input file stream in order to load the data previously stored by the tutorial_pba_0.cpp program in the pba_0.data file.
/** tutorial_pba_1.cpp
 *
 * (C) Copyright 2011 François Mauger, Christian Pfligersdorffer
 *
 * Use, modification and distribution is subject to the Boost Software
 * License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at
 * http://www.boost.org/LICENSE_1_0.txt)
 *
 */

/**
 * The intent of this program is to serve as a tutorial for
 * users of the portable binary archive in the framework of
 * the Boost/Serialization package.
 *
 * This quick start example shows how to load some variables
 * of basic types (bool, integer, floating point numbers, STL string)
 * using the portable binary archive format associated to a
 * standard input file stream.
 *
 */

#include <iostream>
#include <string>
#include <fstream>

#include <boost/cstdint.hpp>
#include <boost/archive/portable_binary_iarchive.hpp>

int main (void)
{
  using namespace std;

  // The name for the example data file :
  string filename = "pba_0.data";

  // Some variables of various types :
  bool     b;
  char     c;
  uint32_t answer;
  float    computing_time;
  double   e;
  string   slogan;

  // Open an input file stream in binary mode :
  ifstream fin (filename.c_str (), ios_base::binary);

  {
    // Create an input portable binary archive attached to the input file :
    boost::archive::portable_binary_iarchive ipba (fin);

    // Load (de-serialize) variables in the same order
    // as for serialization (see tutorial_pba_0.cpp) :
    ipba & b & c & answer & computing_time & e & slogan;
  }

  cout.precision (15);
  cout << "Variable 'b' is : " << b << " " << "(bool)" << endl;
  cout << "Variable 'c' is : '" << c << "' " << " " << "(char)" << endl;
  cout << "Variable 'answer' is : " << answer << " " << "(unsigned 32-bit integer)" << endl;
  cout << "Variable 'computing_time' is : " << computing_time << " " << "(single precision 32-bit float)" << endl;
  cout << "Variable 'e' is : " << e << " " << "(double precision 64-bit float)" << endl;
  cout << "Variable 'slogan' is : \"" << slogan << "\" " << "(std::string)" << endl;

  return 0;
}

// end of tutorial_pba_1.cpp
The executable reads the pba_0.data file and deserializes its contents in the same order in which they were stored. It then prints the restored values of the variables:
Variable 'b' is : 1 (bool)
Variable 'c' is : 'B' (char)
Variable 'answer' is : 42 (unsigned 32-bit integer)
Variable 'computing_time' is : 7500000 (single precision 32-bit float)
Variable 'e' is : 2.71828182845905 (double precision 64-bit float)
Variable 'slogan' is : "DON'T PANIC" (std::string)
This section aims to give some details about the binary format of portable binary archives (PBA). We will analyse the byte contents of the sample binary archive pba_0.data file created by the tutorial_pba_0.cpp program (see the previous section).
Like any other archive format within Boost/Serialization, a PBA starts with a header (this is the default behaviour, but the header can be deactivated with a special flag at construction, see this example). This header is made of two pieces of information:
127
The PBA encoding of integer numbers uses the following scheme: <size> <content>. The size comes first and stores the minimal number of bytes needed to hold the binary representation of the integer value; the content bytes follow, starting from the least significant one (see also this example). For the library version number we have here:
1 9
where 1 is the number of bytes needed to store the value 9, the archive version number that comes with the Serialization library for Boost version 1.47.0. Since 9 is less than 256, a single byte is enough to store it.
Now that we are done with the header, let's look at the serialized data.
1 84
1 66
1 42
This scheme results in saving 2 bytes compared to the size of the transient value.
01001010 | 11100100 | 11100001 | 11000000 |
74 | 228 | 225 | 192 |
4 192 225 228 74
01000000 | 00000101 | 11011111 | 00001010 | 10001011 | 00010100 | 01010111 | 01110100 |
64 | 5 | 191 | 10 | 139 | 20 | 87 | 116 |
8 116 87 20 139 10 191 5 64
1 11
D | O | N | ' | T | (space) | P | A | N | I | C |
68 | 79 | 78 | 39 | 84 | 32 | 80 | 65 | 78 | 73 | 67 |
68 79 78 39 84 32 80 65 78 73 67
Now the contents of the pba_0.data file can be fully understood :
127 1 9 1 84 1 66 1 42 4 192 225 228 74 8 116 87 20 139 10 191 5 64 1 11 68 79 78 39 84 32 80 65 78 73 67
More details about the format (non-finite floating point values, negative integer numbers) are given in the sample codes below.
The PBA has been designed to handle single and double precision floating point numbers, including non-finite and special values:
The tutorial_pba_2.cpp sample program illustrates the use of such special cases while serializing single precision floating point numbers:
/** tutorial_pba_2.cpp
 *
 * (C) Copyright 2011 François Mauger, Christian Pfligersdorffer
 *
 * Use, modification and distribution is subject to the Boost Software
 * License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at
 * http://www.boost.org/LICENSE_1_0.txt)
 *
 */

/**
 * The intent of this program is to serve as a tutorial for
 * users of the portable binary archive in the framework of
 * the Boost/Serialization library.
 *
 * This sample program shows how to use a portable binary archive
 * to store/load floating point numbers including non-finite and
 * special (denormalized) values.
 *
 */

#include <cmath>     // std::fpclassify and the FP_* macros
#include <iostream>
#include <string>
#include <fstream>
#include <limits>

#include <boost/archive/portable_binary_oarchive.hpp>
#include <boost/archive/portable_binary_iarchive.hpp>

int main (void)
{
  using namespace std;

  // The name for the example data file :
  string filename = "pba_2.data";

  {
    // A normal single precision floating point number :
    float pi = 3.14159265;

    // Single precision zeroed floating point number :
    float zero = 0.0;

    // A denormalized single precision floating point number :
    float tiny = 1.e-40;

    // A single precision floating point number with `+Infinity' value :
    float plus_infinity = numeric_limits<float>::infinity ();

    // A single precision floating point number with `-Infinity' value :
    float minus_infinity = -numeric_limits<float>::infinity ();

    // A single precision `Not-a-Number' (NaN):
    float nan = numeric_limits<float>::quiet_NaN ();

    // Open an output file stream in binary mode :
    ofstream fout (filename.c_str (), ios_base::binary);

    {
      // Create an output portable binary archive attached to the output file :
      boost::archive::portable_binary_oarchive opba (fout);

      // Store (serialize) variables :
      opba & pi & zero & tiny & plus_infinity & minus_infinity & nan;
    }
  }

  {
    // Single precision floating point numbers to be loaded :
    float x[6];

    // Open an input file stream in binary mode :
    ifstream fin (filename.c_str (), ios_base::binary);

    {
      // Create an input portable binary archive attached to the input file :
      boost::archive::portable_binary_iarchive ipba (fin);

      // Load (de-serialize) variables in the same
      // order as for serialization :
      for (int i = 0; i < 6; ++i)
        {
          ipba & x[i];
        }
    }

    // Print :
    for (int i = 0; i < 6; ++i)
      {
        cout.precision (8);
        cout << "Loaded x[" << i << "] = " << x[i];
        // The original listing called fp::fpclassify without declaring 'fp';
        // the standard std::fpclassify from <cmath> is used here instead.
        switch (std::fpclassify (x[i]))
          {
          case FP_NAN: cout << " (NaN)"; break;
          case FP_INFINITE: cout << " (infinite)"; break;
          case FP_SUBNORMAL: cout << " (denormalized)"; break;
          case FP_NORMAL: cout << " (normalized)"; break;
          }
        cout << endl;
      }
  }

  return 0;
}

// end of tutorial_pba_2.cpp
The pba_2.data output data file thus contains the following bytes:
127 1 9 4 219 15 73 64 0 3 194 22 1 4 0 0 128 127 4 0 0 128 255 4 255 255 255 127
where:
64 | 73 | 15 | 219 |
01000000 | 01001001 | 00001111 | 11011011 |
0 | 1 | 22 | 194 |
00000000 | 00000001 | 00010110 | 11000010 |
127 | 128 | 0 | 0 |
01111111 | 10000000 | 00000000 | 00000000 |
255 | 128 | 0 | 0 |
11111111 | 10000000 | 00000000 | 00000000 |
127 | 255 | 255 | 255 |
01111111 | 11111111 | 11111111 | 11111111 |
One can ask a PBA to reject non-finite values. This is done by passing the boost::archive::no_infnan flag to the constructor of the output archive. Note that in this case denormalized values are still accepted, but infinite values and NaNs are not.
The tutorial_pba_3.cpp sample program illustrates this special case:
/** tutorial_pba_3.cpp
 *
 * (C) Copyright 2011 François Mauger, Christian Pfligersdorffer
 *
 * Use, modification and distribution is subject to the Boost Software
 * License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at
 * http://www.boost.org/LICENSE_1_0.txt)
 *
 */

/**
 * The intent of this program is to serve as a tutorial for
 * users of the portable binary archive in the framework of
 * the Boost/Serialization library.
 *
 * This sample program shows how to use a portable binary archive
 * and prevent the serialization of non-finite floating numbers.
 *
 */

#include <exception>
#include <iostream>
#include <string>
#include <fstream>
#include <limits>

#include <boost/archive/portable_binary_oarchive.hpp>

int main (void)
{
  using namespace std;

  // The name for the example data file :
  string filename = "pba_3.data";

  try
    {
      // An array of single precision floating numbers:
      float x[5];
      x[0] = 3.14159;  // Pi
      x[1] = 6.022e22; // Avogadro constant
      x[2] = 1.6e-19;  // Electron charge magnitude
      x[3] = 1.e-40;   // A tiny (denormalized) value
      x[4] = numeric_limits<float>::infinity (); // This will fail while serializing...

      // Open an output file stream in binary mode :
      ofstream fout (filename.c_str (), ios_base::binary);

      {
        // Create an output portable binary archive attached to the output file,
        // using the special 'boost::archive::no_infnan' flag :
        boost::archive::portable_binary_oarchive opba (fout, boost::archive::no_infnan);

        // Store (serialize) variables :
        for (int i = 0; i < 5; ++i)
          {
            clog << "Serializing value : " << x[i] << " ... ";
            opba & x[i];
            clog << "Ok !" << endl;
          }
      }
    }
  catch (exception & x)
    {
      cerr << "ERROR: " << x.what () << endl;
      return 1;
    }

  return 0;
}

// end of tutorial_pba_3.cpp
We can check that the PBA now throws an exception as soon as it encounters a non-finite floating point value during the serialization process:
Serializing value : 3.14159 ... Ok !
Serializing value : 6.022e+22 ... Ok !
Serializing value : 1.6e-19 ... Ok !
Serializing value : 9.99995e-41 ... Ok !
Serializing value : inf ... ERROR: serialization of illegal floating point value: inf
The PBA obviously handles integer numbers. Unfortunately, C/C++ does not guarantee a portable size for its primitive integer types (short, int, long... and their unsigned versions). The size depends on the architecture (32-bit/64-bit) and on the compiler.
The Boost library addresses this issue through a collection of typedefs for integer types of common sizes. This technique is meant to allow the manipulation of integer variables in a portable way, typically with text or XML archives. We are therefore generally encouraged to use the boost/cstdint.hpp header file and the typedefs defined therein.
Because of the way it encodes integer numbers, the PBA does not strictly need this technique to behave correctly while (de)serializing integer numbers: the little-endian encoding stores only the significant bytes of a value. It is thus possible to serialize a value using one integer type (short int) and then deserialize it using another integer type (long long).
However, for a strict and safe portable behaviour of the PBA, we recommend that the user systematically use such typedefs for all serializable integer values. This applies particularly to member attributes in structs and classes, and it should allow a transparent switch to another kind of archive (text, XML) thanks to the serialize template method.
The tutorial_pba_4.cpp sample program illustrates the serialization/deserialization of 8-bit, 16-bit, 32-bit and 64-bit integer numbers:
/** tutorial_pba_4.cpp
 *
 * (C) Copyright 2011 François Mauger, Christian Pfligersdorffer
 *
 * Use, modification and distribution is subject to the Boost Software
 * License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at
 * http://www.boost.org/LICENSE_1_0.txt)
 *
 */

/**
 * The intent of this program is to serve as a tutorial for
 * users of the portable binary archive in the framework of
 * the Boost/Serialization library.
 *
 * This sample program shows how to use a portable binary archive
 * to store/load integer numbers of various sizes using the Boost
 * portable integer typedefs.
 *
 */

#include <iostream>
#include <string>
#include <fstream>

#include <boost/cstdint.hpp>
#include <boost/archive/portable_binary_oarchive.hpp>
#include <boost/archive/portable_binary_iarchive.hpp>

int main (void)
{
  using namespace std;

  // The name for the example data file :
  string filename = "pba_4.data";

  {
    // Some integer numbers :
    bool t = true;
    char c = 'c';
    unsigned char u = 'u';
    int8_t   b  = -3; // char
    uint8_t  B  = +6; // unsigned char
    int16_t  s  = -16;
    uint16_t S  = +32;
    int32_t  l  = -128;
    uint32_t L  = +127;
    int64_t  ll = -1024;
    uint64_t LL = +2048;

    // Open an output file stream in binary mode :
    ofstream fout (filename.c_str (), ios_base::binary);

    {
      // Create an output portable binary archive attached to the output file :
      boost::archive::portable_binary_oarchive opba (fout);

      // Store (serialize) variables :
      opba & t & c & u & b & B & s & S & l & L & ll & LL;
    }
  }

  {
    // Some integer numbers to be loaded :
    bool t;
    char c;
    unsigned char u;
    int8_t   b;
    uint8_t  B;
    int16_t  s;
    uint16_t S;
    int32_t  l;
    uint32_t L;
    int64_t  ll;
    uint64_t LL;

    // Open an input file stream in binary mode :
    ifstream fin (filename.c_str (), ios_base::binary);

    {
      // Create an input portable binary archive attached to the input file :
      boost::archive::portable_binary_iarchive ipba (fin);

      // Load (de-serialize) variables in the same
      // order as for serialization :
      ipba & t & c & u & b & B & s & S & l & L & ll & LL;
    }

    clog << "t = " << t << " (bool)" << endl;
    clog << "c = '" << c << "' (char)" << endl;
    clog << "u = '" << u << "' (unsigned char)" << endl;
    clog << "b = " << (int) b << " (int8_t)" << endl;
    clog << "B = " << (int) B << " (uint8_t)" << endl;
    clog << "s = " << s << " (int16_t)" << endl;
    clog << "S = " << S << " (uint16_t)" << endl;
    clog << "l = " << l << " (int32_t)" << endl;
    clog << "L = " << L << " (uint32_t)" << endl;
    clog << "ll = " << ll << " (int64_t)" << endl;
    clog << "LL = " << LL << " (uint64_t)" << endl;
  }

  return 0;
}

// end of tutorial_pba_4.cpp
The resulting PBA file is:
127 1 9 1 84 1 99 1 117 255 253 1 6 255 240 1 32 255 128 1 127 254 0 252 2 0 8
where:
Note that this coding scheme optimizes the number of streamed bytes. In particular, it discards the leading zero bytes (the most significant ones) of the binary encoding of any integer value in order to save storage. Recall also that an exact 0 value (zero, or false for a boolean) is always encoded as a single 0 byte (zero optimization). The same approach is used for floating point numbers.
In some cases, we do not want to serialize data to a file (std::ofstream) but simply to stream it into a memory buffer.
The tutorial_pba_5.cpp sample program makes use of a memory buffer implemented with an STL vector of characters. The PBA is associated with this buffer through a streaming interface provided by the Boost/Iostreams library. With this technique one can stream serializable data into a memory buffer instead of a file:
/** tutorial_pba_5.cpp
 *
 * (C) Copyright 2011 François Mauger, Christian Pfligersdorffer
 *
 * Use, modification and distribution is subject to the Boost Software
 * License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at
 * http://www.boost.org/LICENSE_1_0.txt)
 *
 */

/**
 * The intent of this program is to serve as a tutorial for
 * users of the portable binary archive in the framework of
 * the Boost/Serialization library.
 *
 * This sample program shows how to use a portable binary archive
 * to store/load data in a memory buffer.
 *
 */

#include <cstddef>
#include <iostream>
#include <limits>
#include <string>
#include <vector>

#include <boost/iostreams/stream.hpp>
#include <boost/iostreams/device/back_inserter.hpp>
#include <boost/iostreams/device/array.hpp>

#include <boost/cstdint.hpp>
#include <boost/archive/portable_binary_oarchive.hpp>
#include <boost/archive/portable_binary_iarchive.hpp>

int main (void)
{
  using namespace std;

  // The memory buffer is implemented using a STL vector :
  typedef std::vector<char> buffer_type;
  buffer_type buffer;

  {
    // Some data to be stored :
    bool    t = true;
    char    c = 'c';
    int16_t s = +16;
    int32_t l = -128;
    int64_t ll = +10000000000;
    float   pi = 3.14159;
    double  nan = numeric_limits<double>::quiet_NaN ();
    string  hello = "World !";

    buffer.reserve (1024); // pre-allocate some memory

    // The output stream interface to the buffer :
    boost::iostreams::stream<boost::iostreams::back_insert_device<buffer_type> > output_stream (buffer);

    {
      // Create an output portable binary archive attached to the output stream :
      boost::archive::portable_binary_oarchive opba (output_stream);

      // Store (serialize) variables :
      opba & t & c & s & l & ll & pi & nan & hello;
    }
  }

  clog << "Buffer content is " << buffer.size () << " bytes : " << endl << " ";
  for (std::size_t i = 0; i < buffer.size (); ++i)
    {
      clog << (int) ((unsigned char) buffer[i]) << ' ';
      if ((i + 1) % 20 == 0) clog << endl << " ";
    }
  clog << endl;

  {
    // Some data to be loaded :
    bool    t;
    char    c;
    int16_t s;
    int32_t l;
    int64_t ll;
    float   pi;
    double  nan;
    string  hello;

    // The input stream interface to the buffer :
    boost::iostreams::stream<boost::iostreams::array_source> input_stream (&buffer[0], buffer.size ());

    {
      // Create an input portable binary archive attached to the input stream :
      boost::archive::portable_binary_iarchive ipba (input_stream);

      // Load (de-serialize) variables :
      ipba & t & c & s & l & ll & pi & nan & hello;
    }

    clog << "Loaded values from the buffer are: " << endl;
    clog << " t = " << t << " (bool)" << endl;
    clog << " c = '" << c << "' (char)" << endl;
    clog << " s = " << s << " (int16_t)" << endl;
    clog << " l = " << l << " (int32_t)" << endl;
    clog << " ll = " << ll << " (int64_t)" << endl;
    clog << " pi = " << pi << " (float)" << endl;
    clog << " nan = " << nan << " (double)" << endl;
    clog << " hello = \"" << hello << "\" (std::string)" << endl;
  }

  return 0;
}

// end of tutorial_pba_5.cpp
After the data has been stored in the archive, the content of the character buffer is printed:
Buffer content is 40 bytes :
 127 1 9 1 84 1 99 1 16 255 128 5 0 228 11 84 2 4 208 15
 73 64 8 255 255 255 255 255 255 255 127 1 7 87 111 114 108 100 32 33
Loaded values from the buffer are:
 t = 1 (bool)
 c = 'c' (char)
 s = 16 (int16_t)
 l = -128 (int32_t)
 ll = 10000000000 (int64_t)
 pi = 3.14159 (float)
 nan = nan (double)
 hello = "World !" (std::string)
Again the PBA encoding scheme can be easily interpreted. This is left as an exercise.
You may have a look at the tutorial_pba_6.cpp program, which shows a possible (and provocative) combined usage of the Boost/Serialization concepts, the Boost/Iostreams facilities and the PBA: it enables the copy of an object of a non-copyable class.
/** tutorial_pba_6.cpp
 *
 * (C) Copyright 2011 François Mauger, Christian Pfligersdorffer
 *
 * Use, modification and distribution is subject to the Boost Software
 * License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at
 * http://www.boost.org/LICENSE_1_0.txt)
 *
 */

/**
 * The intent of this program is to serve as a tutorial for
 * users of the portable binary archive in the framework of
 * the Boost/Serialization library.
 *
 * This sample program shows how to use a portable binary archive
 * associated to a memory buffer to copy a non-copyable object.
 *
 */

#include <iostream>
#include <limits>
#include <string>
#include <sstream>
#include <vector>

#include <boost/utility.hpp>
#include <boost/iostreams/stream.hpp>
#include <boost/iostreams/device/back_inserter.hpp>
#include <boost/iostreams/device/array.hpp>

#include <boost/cstdint.hpp>
#include <boost/archive/portable_binary_oarchive.hpp>
#include <boost/archive/portable_binary_iarchive.hpp>

using namespace std;

/* A foo noncopyable class */
struct foo : boost::noncopyable
{
  uint32_t status;
  double   value;
  double   special;

  string to_string () const
  {
    ostringstream sout;
    sout << "foo={status=" << status << "; value=" << value << "; special=" << special << "}";
    return sout.str ();
  }

  template<class Archive>
  void serialize (Archive & ar, const unsigned int version)
  {
    ar & status;
    ar & value;
    ar & special;
    return;
  }
};

// A templatized copy function for Boost/Serialization equipped classes.
// Here we use PBAs associated to a memory buffer :
template <class Serializable>
void copy (const Serializable & source, Serializable & target)
{
  namespace io = boost::iostreams;
  namespace ba = boost::archive;
  if (&source == &target) return; // self-copy guard
  typedef std::vector<char> buffer_type;
  buffer_type buffer;
  buffer.reserve (1024);
  {
    io::stream<io::back_insert_device<buffer_type> > output_stream (buffer);
    ba::portable_binary_oarchive opba (output_stream);
    opba & source;
  }
  {
    io::stream<io::array_source> input_stream (&buffer[0], buffer.size ());
    ba::portable_binary_iarchive ipba (input_stream);
    ipba & target;
  }
  return;
}

int main (void)
{
  // Some instance of the 'foo' class :
  foo dummy;
  dummy.status = 1;
  dummy.value = 3.14159;
  dummy.special = numeric_limits<double>::quiet_NaN ();
  clog << "dummy is : " << dummy.to_string () << endl;

  // Another instance of the 'foo' class :
  foo clone;

  /* The following instruction is forbidden because foo
     inherits 'boost::noncopyable' :

     clone = dummy; // this ends in a compilation error.

  */

  // Anyway, we can use this workaround :
  copy (dummy, clone);
  clog << "clone is : " << clone.to_string () << endl;

  return 0;
}

// end of tutorial_pba_6.cpp
Remark: if a class has been made non-copyable by design, it is likely for a good reason. Working around this trait with such a trick is therefore not recommended, unless you know what you are doing and understand all the consequences.
In some circumstances, it may be useful to use the Boost text and XML archives in a somewhat portable way. For example, we may want to benefit from the XML archive's human-friendly format for debugging purposes before switching to the PBA for production runs. However, the text and XML archives provided by the Boost serialization library are not strictly portable, particularly because they do not support the serialization of non-finite floating point numbers. This is because the serialization of floating point numbers depends on some formatting features of standard I/O streams. See the tutorial_pba_7.cpp sample program below:
/** tutorial_pba_7.cpp
 *
 * (C) Copyright 2011 François Mauger, Christian Pfligersdorffer
 *
 * Use, modification and distribution is subject to the Boost Software
 * License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at
 * http://www.boost.org/LICENSE_1_0.txt)
 *
 */

/**
 * The intent of this program is to serve as a tutorial for
 * users of the portable binary archive in the framework of
 * the Boost/Serialization library.
 *
 * This example shows how the default behaviour of standard
 * I/O streams does not support the read/write operations of
 * non-finite floating point values in a portable way.
 *
 */

#include <string>
#include <iostream>
#include <sstream>
#include <limits>

using namespace std;

int main (void)
{
  {
    float x = numeric_limits<float>::infinity ();
    double y = numeric_limits<double>::quiet_NaN ();
    cout.precision (8);
    cout << "x = " << x << endl;
    cout.precision (16);
    cout << "y = " << y << endl;
  }

  {
    string input ("inf nan");
    istringstream iss (input);
    float x;
    double y;
    iss >> x >> y;
    if (! iss)
      {
        cerr << "Cannot read 'x' or 'y' : non finite values are not supported !" << endl;
      }
  }

  return 0;
}

// end of tutorial_pba_7.cpp
Depending on the system, one can get different representations of the infinity and NaN values, respectively:
1.#INF and -1.#IND
or:
inf and nan
Fortunately this issue can be solved by configuring the I/O streams with some special locale features provided by Boost (see this link).
The tutorial_pba_8.cpp program shows how this can be achieved with the facilities declared in the boost/archive/codecvt_null.hpp and boost/math/special_functions/nonfinite_num_facets.hpp headers:
/** tutorial_pba_8.cpp
 *
 * (C) Copyright 2011 François Mauger, Christian Pfligersdorffer
 *
 * Use, modification and distribution is subject to the Boost Software
 * License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at
 * http://www.boost.org/LICENSE_1_0.txt)
 *
 */

/**
 * The intent of this program is to serve as a tutorial for
 * users of the portable binary archive in the framework of
 * the Boost/Serialization library.
 *
 * This example shows how to store some variables
 * of basic types (bool, integer, floating point numbers, STL string)
 * using the text or XML archive format associated to a
 * standard output file stream supporting portable non-finite
 * floating point values.
 *
 */

#include <string>
#include <fstream>
#include <limits>
#include <locale>

#include <boost/cstdint.hpp>
#include <boost/archive/xml_oarchive.hpp>
#include <boost/archive/text_oarchive.hpp>
#include <boost/serialization/nvp.hpp>
#include <boost/archive/codecvt_null.hpp>
#include <boost/math/special_functions/nonfinite_num_facets.hpp>

using namespace std;

void do_text_out (void)
{
  // The name for the example data text file :
  string filename = "pba_8.txt";

  // Some variables of various primitive types :
  bool     b         = true;
  char     c         = 'B';
  uint32_t answer    = 42;
  float    value     = numeric_limits<float>::infinity ();
  double   precision = numeric_limits<double>::quiet_NaN ();
  string   question  = "What makes you think she's a witch?";

  // Open an output file stream :
  ofstream fout (filename.c_str ());

  // Prepare the output file stream for inf/NaN support :
  locale default_locale (locale::classic (), new boost::archive::codecvt_null<char>);
  locale infnan_locale (default_locale, new boost::math::nonfinite_num_put<char>);
  fout.imbue (infnan_locale);

  {
    // Create an output text archive attached to the output file :
    boost::archive::text_oarchive ota (fout, boost::archive::no_codecvt);

    // Store (serializing) variables :
    ota & b & c & answer & value & precision & question;
  }

  return;
}

void do_xml_out (void)
{
  // The name for the example data XML file :
  string filename = "pba_8.xml";

  // Some variables of various primitive types :
  bool     b         = true;
  char     c         = 'B';
  uint32_t answer    = 42;
  float    value     = numeric_limits<float>::infinity ();
  double   precision = numeric_limits<double>::quiet_NaN ();
  string   question  = "What makes you think she's a witch?";

  // Open an output file stream :
  ofstream fout (filename.c_str ());

  // Prepare the output file stream for inf/NaN support :
  locale default_locale (locale::classic (), new boost::archive::codecvt_null<char>);
  locale infnan_locale (default_locale, new boost::math::nonfinite_num_put<char>);
  fout.imbue (infnan_locale);

  {
    // Create an output XML archive attached to the output file :
    boost::archive::xml_oarchive oxa (fout, boost::archive::no_codecvt);

    // Store (serializing) variables :
    oxa & BOOST_SERIALIZATION_NVP(b)
        & BOOST_SERIALIZATION_NVP(c)
        & BOOST_SERIALIZATION_NVP(answer)
        & BOOST_SERIALIZATION_NVP(value)
        & BOOST_SERIALIZATION_NVP(precision)
        & BOOST_SERIALIZATION_NVP(question);
  }

  return;
}

int main (void)
{
  do_text_out ();
  do_xml_out ();
  return 0;
}

// end of tutorial_pba_8.cpp
The program creates two output files :
22 serialization::archive 9 1 66 42 inf nan 35 What makes you think she's a witch?
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE boost_serialization>
<boost_serialization signature="serialization::archive" version="9">
<b>1</b>
<c>66</c>
<answer>42</answer>
<value>inf</value>
<precision>nan</precision>
<question>What makes you think she's a witch?</question>
</boost_serialization>
The tutorial_pba_9.cpp program deserializes the data from the text and XML archive files (respectively pba_8.txt and pba_8.xml) and prints the restored variables :
Loaded values from text archive are:
 b = 1
 c = 'B'
 answer = 42
 value = inf
 precision = nan
 question = "What makes you think she's a witch?"
Loaded values from XML archive are:
 b = 1
 c = 'B'
 answer = 42
 value = inf
 precision = nan
 question = "What makes you think she's a witch?"
/** tutorial_pba_9.cpp
 *
 * (C) Copyright 2011 François Mauger, Christian Pfligersdorffer
 *
 * Use, modification and distribution is subject to the Boost Software
 * License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at
 * http://www.boost.org/LICENSE_1_0.txt)
 *
 */

/**
 * The intent of this program is to serve as a tutorial for
 * users of the portable binary archive in the framework of
 * the Boost/Serialization library.
 *
 * This example shows how to load some variables of basic
 * types (bool, char, integer, floating point numbers, STL string)
 * using the text or XML archive format associated to a
 * standard file input stream supporting portable non-finite
 * floating point values.
 *
 */

#include <iostream>
#include <string>
#include <fstream>
#include <limits>
#include <locale>

#include <boost/cstdint.hpp>
#include <boost/archive/xml_iarchive.hpp>
#include <boost/archive/text_iarchive.hpp>
#include <boost/serialization/nvp.hpp>
#include <boost/scoped_ptr.hpp>
#include <boost/archive/codecvt_null.hpp>
#include <boost/math/special_functions/nonfinite_num_facets.hpp>

using namespace std;

void do_text_in (void)
{
  // The name for the example data text file :
  string filename = "pba_8.txt";

  // Some variables of various primitive types :
  bool     b;
  char     c;
  uint32_t answer;
  float    value;
  double   precision;
  string   question;

  // Open an input file stream :
  ifstream fin (filename.c_str ());

  // Prepare the input file stream for inf/NaN support :
  locale default_locale (locale::classic (), new boost::archive::codecvt_null<char>);
  locale infnan_locale (default_locale, new boost::math::nonfinite_num_get<char>);
  fin.imbue (infnan_locale);

  {
    // Create an input text archive attached to the input file :
    boost::archive::text_iarchive ita (fin, boost::archive::no_codecvt);

    // Load (de-serialize) variables :
    ita & b & c & answer & value & precision & question;
  }

  clog << "Loaded values from text archive are: " << endl;
  clog << " b = " << b << endl;
  clog << " c = '" << c << "'" << endl;
  clog << " answer = " << answer << endl;
  clog << " value = " << value << endl;
  clog << " precision = " << precision << endl;
  clog << " question = \"" << question << "\"" << endl;

  return;
}

void do_xml_in (void)
{
  // The name for the example data XML file :
  string filename = "pba_8.xml";

  // Some variables of various primitive types :
  bool     b;
  char     c;
  uint32_t answer;
  float    value;
  double   precision;
  string   question;

  // Open an input file stream :
  ifstream fin (filename.c_str ());

  // Prepare the input file stream for inf/NaN support :
  locale default_locale (locale::classic (), new boost::archive::codecvt_null<char>);
  locale infnan_locale (default_locale, new boost::math::nonfinite_num_get<char>);
  fin.imbue (infnan_locale);

  {
    // Create an input XML archive attached to the input file :
    boost::archive::xml_iarchive ixa (fin, boost::archive::no_codecvt);

    // Load (de-serialize) variables :
    ixa & BOOST_SERIALIZATION_NVP(b)
        & BOOST_SERIALIZATION_NVP(c)
        & BOOST_SERIALIZATION_NVP(answer)
        & BOOST_SERIALIZATION_NVP(value)
        & BOOST_SERIALIZATION_NVP(precision)
        & BOOST_SERIALIZATION_NVP(question);
  }

  clog << "Loaded values from XML archive are: " << endl;
  clog << " b = " << b << endl;
  clog << " c = '" << c << "'" << endl;
  clog << " answer = " << answer << endl;
  clog << " value = " << value << endl;
  clog << " precision = " << precision << endl;
  clog << " question = \"" << question << "\"" << endl;

  return;
}

int main (void)
{
  do_text_in ();
  do_xml_in ();
  return 0;
}

// end of tutorial_pba_9.cpp
The tutorial_pba_10.cpp program illustrates how to serialize a class to, then deserialize it from, a PBA attached to a GZIP-compressed file stream, using the filtering streams provided by the Boost/Iostreams library. The class contains an STL vector of 1000 double precision floating point numbers with arbitrary (possibly non-finite) values:
/** tutorial_pba_10.cpp
 *
 * (C) Copyright 2011 François Mauger, Christian Pfligersdorffer
 *
 * Use, modification and distribution is subject to the Boost Software
 * License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at
 * http://www.boost.org/LICENSE_1_0.txt)
 *
 */

/**
 * The intent of this program is to serve as a tutorial for
 * users of the portable binary archive in the framework of
 * the Boost/Serialization library.
 *
 * This example shows how to use PBAs combined with on-the-fly
 * compressed I/O streams.
 *
 */

#include <string>
#include <fstream>
#include <iostream>
#include <limits>
#include <vector>

#include <boost/cstdint.hpp>
#include <boost/archive/portable_binary_oarchive.hpp>
#include <boost/archive/portable_binary_iarchive.hpp>
#include <boost/iostreams/filtering_stream.hpp>
#include <boost/iostreams/filter/gzip.hpp>
#include <boost/serialization/access.hpp>
#include <boost/serialization/vector.hpp>

using namespace std;

class data_type
{
private:
  friend class boost::serialization::access;
  template<class Archive>
  void serialize (Archive & ar, const unsigned int version);
public:
  void print (ostream & out, const string & title) const;
public:
  vector<double> values;
  data_type ();
};

data_type::data_type () : values ()
{
  return;
}

void data_type::print (ostream & out, const string & title) const
{
  out << endl;
  out << title << " :" << endl;
  for (size_t i = 0; i < this->values.size (); ++i)
    {
      out.precision (16);
      out.width (18);
      out << this->values [i] << ' ' ;
      // Four values per line, written to the same stream :
      if ((i%4) == 3) out << endl;
    }
  out << endl;
  return;
}

template<class Archive>
void data_type::serialize (Archive & ar, const unsigned int version)
{
  ar & values;
  return;
}

void do_gzipped_out (void)
{
  // The name for the output data file :
  string filename = "pba_10.data.gz";

  // A data structure to be stored :
  data_type my_data;

  // Fill the vector with arbitrary (possibly non-finite) values :
  size_t dim = 1000;
  my_data.values.reserve (dim);
  for (size_t i = 0; i < dim; ++i)
    {
      double val = (i + 1) * (1.0 + 3 * numeric_limits<double>::epsilon ());
      if (i == 4) val = numeric_limits<double>::quiet_NaN ();
      if (i == 23) val = numeric_limits<double>::infinity ();
      if (i == 73) val = -numeric_limits<double>::infinity ();
      if (i == 90) val = 0.0;
      my_data.values.push_back (val);
    }

  // Print:
  my_data.print (clog, "Stored data");

  // Create an output filtering stream :
  boost::iostreams::filtering_ostream zout;
  zout.push (boost::iostreams::gzip_compressor ());

  // Open an output file stream in binary mode :
  ofstream fout (filename.c_str (), ios_base::binary);
  zout.push (fout);

  // Save to PBA :
  {
    // Create an output portable binary archive attached to the
    // compressing output stream :
    boost::archive::portable_binary_oarchive opba (zout);

    // Store (serializing) the data :
    opba & my_data;
  }

  // Clean termination of the streams :
  zout.flush ();
  zout.reset ();

  return;
}

void do_gzipped_in (void)
{
  // The name for the input data file :
  string filename = "pba_10.data.gz";

  // A data structure to be loaded :
  data_type my_data;

  // Create an input filtering stream :
  boost::iostreams::filtering_istream zin;
  zin.push (boost::iostreams::gzip_decompressor ());

  // Open an input file stream in binary mode :
  ifstream fin (filename.c_str (), ios_base::binary);
  zin.push (fin);

  // Load from PBA :
  {
    // Create an input portable binary archive attached to the
    // decompressing input stream :
    boost::archive::portable_binary_iarchive ipba (zin);

    // Load (deserializing) the data :
    ipba & my_data;
  }

  // Print:
  my_data.print (clog, "Loaded data");

  return;
}

int main (void)
{
  do_gzipped_out ();
  do_gzipped_in ();
  return 0;
}

// end of tutorial_pba_10.cpp
The resulting compressed pba_10.data.gz file is 1,574 bytes long. By comparison, the plain (uncompressed) binary archive is 9,001 bytes long and starts with the following bytes:
127 1 9 0 0 2 232 3 0 8 3 0 0 0 0 0 240 63 8 3 0 0 0 0 0 0 64 8 4 0 0 0 0 0 8 64 8 3 0 0 0 0 0 16 64 8 255 255 255 255 255 255 255 127 8 4 0 0 0 0 0 24 64 8 ...
which can be interpreted as:
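Read together with the program above, the dump appears to follow a size-prefixed, little-endian layout: after the leading byte 127, each value is written as one size byte followed by that many payload bytes, least significant first, and a zero value is written as a lone size byte of 0. The sketch below decodes values under that assumption only. It is inferred from the bytes shown here, not taken from a format specification, and the helper names read_prefixed and bits_to_double are hypothetical. It handles unsigned values only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Decode one value stored as <size byte><size payload bytes, little-endian>,
// starting at bytes[pos]. On return, pos points just past the value.
// Assumption: this layout is inferred from the dump, unsigned values only.
std::uint64_t read_prefixed (const std::vector<unsigned char> & bytes,
                             std::size_t & pos)
{
  assert (pos < bytes.size ());
  const std::size_t size = bytes[pos++];
  assert (size <= 8 && pos + size <= bytes.size ());
  std::uint64_t value = 0;
  for (std::size_t k = 0; k < size; ++k)
    {
      // Byte k carries bits 8k .. 8k+7 of the value :
      value |= static_cast<std::uint64_t> (bytes[pos + k]) << (8 * k);
    }
  pos += size;
  return value;
}

// Reinterpret a 64-bit pattern as an IEEE 754 double.
double bits_to_double (std::uint64_t bits)
{
  static_assert (sizeof (double) == sizeof (std::uint64_t),
                 "this sketch assumes a 64-bit double");
  double d;
  std::memcpy (&d, &bits, sizeof d);
  return d;
}
```

Starting at position 1 (just after the 127 byte), successive calls on the dump yield 9, 0, 0, 1000 and 0, then the bit pattern 0x3FF0000000000003, which bits_to_double turns into 1 + 3 epsilon, the first element of the vector. The value 1000 matches the vector size; the roles of the other header values are not established by the dump alone. Under the same reading the 9,001 bytes add up: 9 header bytes, 999 doubles at 9 bytes each, and a single byte for the 0.0 stored at index 90.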
It is also possible to use BZIP2 in a similar fashion (using resources from the boost/iostreams/filter/bzip2.hpp header in place of boost/iostreams/filter/gzip.hpp).
The tutorial_pba_11.cpp program runs a benchmark that compares the relative speed of PBA and text archives for both read and write operations. It stores, then loads, a vector of 10,000,000 (10^7) random double values and prints the (de)serialization times for both kinds of archives:
/** tutorial_pba_11.cpp
 *
 * (C) Copyright 2011 François Mauger, Christian Pfligersdorffer
 *
 * Use, modification and distribution is subject to the Boost Software
 * License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at
 * http://www.boost.org/LICENSE_1_0.txt)
 *
 */

/**
 * The intent of this program is to serve as a tutorial for
 * users of the portable binary archive in the framework of
 * the Boost/Serialization library.
 *
 * This example program compares the times needed to serialize
 * and deserialize some large amount of data using PBA and
 * text archives.
 *
 */

#include <string>
#include <fstream>
#include <iostream>
#include <vector>

#include <boost/archive/portable_binary_oarchive.hpp>
#include <boost/archive/portable_binary_iarchive.hpp>
#include <boost/archive/text_oarchive.hpp>
#include <boost/archive/text_iarchive.hpp>
#include <boost/serialization/access.hpp>
#include <boost/serialization/vector.hpp>
#include <boost/random/mersenne_twister.hpp>
#include <boost/random/uniform_real_distribution.hpp>
#include <boost/timer.hpp>

using namespace std;

class data_type
{
private:
  friend class boost::serialization::access;
  template<class Archive>
  void serialize (Archive & ar, const unsigned int version);
public:
  void print (ostream & out, const string & title) const;
public:
  vector<double> values;
  data_type ();
};

data_type::data_type () : values ()
{
  return;
}

void data_type::print (ostream & out, const string & title) const
{
  out << endl;
  out << title << " :" << endl;
  bool skip = false;
  const size_t n = this->values.size ();
  for (size_t i = 0; i < n; ++i)
    {
      // Only print the first 12 and the last 8 values :
      if ((i >= 12) && (i + 8 < n))
        {
          if (! skip) out << " ..." << endl;
          skip = true;
          continue;
        }
      out.precision (16);
      out.width (18);
      out << this->values [i] << ' ' ;
      if ((i%4) == 3) out << endl;
    }
  out << endl;
  return;
}

template<class Archive>
void data_type::serialize (Archive & ar, const unsigned int version)
{
  ar & values;
  return;
}

double do_pba_out (const data_type & a_data)
{
  string filename = "pba_11.data";
  ofstream fout (filename.c_str (), ios_base::binary);
  boost::timer io_timer;
  {
    boost::archive::portable_binary_oarchive opba (fout);
    opba & a_data;
  }
  return io_timer.elapsed ();
}

double do_pba_in (data_type & a_data)
{
  string filename = "pba_11.data";
  ifstream fin (filename.c_str (), ios_base::binary);
  boost::timer io_timer;
  {
    boost::archive::portable_binary_iarchive ipba (fin);
    ipba & a_data;
  }
  return io_timer.elapsed ();
}

double do_text_out (const data_type & a_data)
{
  string filename = "pba_11.txt";
  ofstream fout (filename.c_str ());
  boost::timer io_timer;
  {
    boost::archive::text_oarchive ota (fout);
    ota & a_data;
  }
  return io_timer.elapsed ();
}

double do_text_in (data_type & a_data)
{
  string filename = "pba_11.txt";
  ifstream fin (filename.c_str ());
  boost::timer io_timer;
  {
    boost::archive::text_iarchive ita (fin);
    ita & a_data;
  }
  return io_timer.elapsed ();
}

int main (void)
{
  double elapsed_time_pba_out;
  double elapsed_time_text_out;
  double elapsed_time_pba_in;
  double elapsed_time_text_in;

  data_type my_data; // A data structure to be stored then loaded.

  {
    // Fill the vector with random values :
    size_t dim = 10000000;
    my_data.values.reserve (dim);
    boost::random::mt19937 rng;
    boost::random::uniform_real_distribution<> flat (0.0, 100.0);
    for (size_t i = 0; i < dim; ++i)
      {
        double val = flat (rng);
        my_data.values.push_back (val);
      }
    my_data.print (clog, "Stored data in PBA and text archive");
  }

  {
    // Store in PBA :
    elapsed_time_pba_out = do_pba_out (my_data);
  }

  {
    // Store in text archive :
    elapsed_time_text_out = do_text_out (my_data);
  }

  {
    my_data.values.clear ();
    // Load from PBA :
    elapsed_time_pba_in = do_pba_in (my_data);
    my_data.print (clog, "Loaded data from PBA");
  }

  {
    my_data.values.clear ();
    // Load from text archive :
    elapsed_time_text_in = do_text_in (my_data);
    my_data.print (clog, "Loaded data from text archive");
  }

  clog << "PBA store I/O elapsed time : " << elapsed_time_pba_out << " (second)" << endl;
  clog << "Text store I/O elapsed time : " << elapsed_time_text_out << " (second)" << endl;
  clog << "PBA load I/O elapsed time : " << elapsed_time_pba_in << " (second)" << endl;
  clog << "Text load I/O elapsed time : " << elapsed_time_text_in << " (second)" << endl;

  return 0;
}

// end of tutorial_pba_11.cpp
On a 1.60 GHz processor, with the program compiled with gcc 4.5.2 and run on Linux 2.6.38, the result is the following:
PBA store I/O elapsed time : 1.86 (second)
Text store I/O elapsed time : 22.66 (second)
PBA load I/O elapsed time : 1.53 (second)
Text load I/O elapsed time : 19.71 (second)
In this simple case, portable binary archives are faster than the traditional Boost text archives by roughly a factor of 12 (22.66/1.86 ≈ 12.2 for storing, 19.71/1.53 ≈ 12.9 for loading). This is a significant saving in time. Such performance is highly desirable in scientific computing, where large amounts of data are typically accessed through files. The PBA concept is thus a valuable candidate for such applications.
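The boost::timer class used above is based on std::clock, so the figures it reports are processor time rather than wall-clock time. For an I/O benchmark it may also be worth measuring wall-clock time. The following is a minimal sketch of such a measurement, assuming a C++11 compiler; the helper names elapsed_seconds and speedup are hypothetical and not part of the tutorial programs.

```cpp
#include <chrono>

// Run `action` once and return the elapsed wall-clock time in seconds,
// measured with a monotonic clock.
template <class Action>
double elapsed_seconds (Action action)
{
  const std::chrono::steady_clock::time_point start =
    std::chrono::steady_clock::now ();
  action ();
  const std::chrono::duration<double> elapsed =
    std::chrono::steady_clock::now () - start;
  return elapsed.count ();
}

// Ratio of a slower timing to a faster one, both in seconds.
double speedup (double slow_seconds, double fast_seconds)
{
  return slow_seconds / fast_seconds;
}
```

A call such as elapsed_seconds ([&] { do_pba_out (my_data); }) would time the whole function, including opening the file, which the original timers exclude. With the figures quoted above, speedup (22.66, 1.86) is about 12.2 and speedup (19.71, 1.53) is about 12.9.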
One can also consider the sizes of the resulting archive files:
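The sizes can be measured directly on the pba_11.data and pba_11.txt files written by the benchmark. Below is a minimal sketch using only the standard library; the helper name file_size_in_bytes is hypothetical.

```cpp
#include <fstream>
#include <string>

// Return the size of a file in bytes, or -1 if the file cannot be opened.
long long file_size_in_bytes (const std::string & filename)
{
  // Open in binary mode, positioned at the end of the file :
  std::ifstream fin (filename.c_str (),
                     std::ios_base::binary | std::ios_base::ate);
  if (! fin) return -1;
  // The current read position is then the file length :
  return static_cast<long long> (fin.tellg ());
}
```

Calling it on both file names after running tutorial_pba_11 gives the two sizes to compare.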
Revised 2011-11-07
© Copyright François Mauger,
Christian Pfligersdorffer 2011.
Distributed under the Boost Software License, Version 1.0.
(See accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)