using the serialization library with a std::map of Blitz++ arrays

Dear all,

I have a code that uses a class which is basically a std::map from int to Blitz++ arrays. (I think that for this matter a Blitz++ array can be considered a plain array.) I need to save a bunch of these objects to a file and read them back later during the same run. A typical run will produce a few thousand of these maps, each with on the order of 50 entries, and each of these ~50 Blitz++ arrays holding a few thousand doubles.

Right now I'm just using the Blitz++ overloaded operator<< (operator>>) for the output (input), which pretty much runs over the Blitz++ array applying operator<< (operator>>) to each double. I guess the serialization library with a binary archive would make a difference in the performance of my code.

I've understood from the documentation that the serialization of STL containers can be done just by including the proper header, but that I must serialize the Blitz++ array myself. Then I could use whichever Archive (binary in my case) I want. My question is: is that the only thing I have to do, or did I get it wrong?

I've also seen an old thread on the mailing lists where it seemed that some people were pushing to include built-in support for Blitz++ (and some other numerical stuff) in the serialization library. Has this ever been done?

More generally, would this be the best strategy to get fast input/output for large numerical arrays, or is there something else within Boost I could use?

Best,
Ivan

PS: I noticed that there was a tutorial on BoostCon08 on the serialization library. Is it possible to get the slides/notes?

Ivan Gonzalez wrote:
PS: I noticed that there was a tutorial on BoostCon08 on the serialization library. Is it possible to get the slides/notes?
Hi Ivan,
I did the presentation but there is really no information in the slides. Most of the benefit was the verbal explanation (in my head anyway!). I would recommend you look at the documentation for the library:
http://www.boost.org/doc/libs/1_35_0/libs/serialization/doc/index.html
Specifically, this page should be of interest for your problem of serializing Blitz++ arrays:
http://www.boost.org/doc/libs/1_35_0/libs/serialization/doc/wrappers.html
--
Sohail Somani
http://uint32t.blogspot.com

On May 13, 2008, at 4:25 PM, Ivan Gonzalez wrote:
More generally, would this be the best strategy to get fast input/output for large numerical arrays, or is there something else within Boost I could use?
This is the best strategy.
Matthias

Thanks very much to both Sohail and Matthias for your replies! I finally got it working (I didn't have Boost 1.35 on any machine). The increase in performance is *really* nice: in small but relevant runs of my code, I got an overall speed-up of around 300% just by serializing the input/output of the Blitz++ arrays. (This small run saves 1 minute in writing and later reading around 10^5 maps holding 10^5 doubles each, compared to using operator<< and operator>> from Blitz++.)

I'm describing my experience here for the record.

1) You need Boost 1.35 (prior versions do not have the support for dense arrays that Sohail mentioned).
2) You need to build Boost serialization (although some parts of the docs say it is header-only) and link your code against it.
3) There is a bug in the library; a fix is here: http://svn.boost.org/trac/boost/ticket/1822
4) When writing to a boost::archive::binary_oarchive or reading from a boost::archive::binary_iarchive, open the streams you pass to the archives in std::ios::binary mode.

The code relevant for the serialization of the Blitz++ arrays is shown below. The serialization of the STL container (std::map or whatever) is done by including the proper header (as explained in the serialization docs).

    #include <blitz/array.h>
    #include <boost/serialization/array.hpp>                 // make_array
    #include <boost/serialization/collection_size_type.hpp>
    #include <boost/serialization/nvp.hpp>
    #include <boost/serialization/split_free.hpp>

    /**
     * @typedef Defines a matrix type for the example.
     * Generalizations to higher-order tensors are trivial
     */
    typedef blitz::Array<double, 2> my_Matrix;

    #ifdef SERIALIZE_WITH_BOOST
    /**
     * Functions to serialize array. The data() member function in Blitz++
     * gives you a pointer to the first element of the array holding the data.
     * Be careful if you use a non-default Blitz++ order.
     *
     * @see
     * http://www.boost.org/doc/libs/1_35_0/libs/serialization/doc/wrappers.html
     */
    namespace boost {
    namespace serialization {

    template<class Archive>
    inline void save(Archive & ar, const ::my_Matrix & t,
                     const unsigned int /* file_version */)
    {
        // Store the dimensions first so that load() can resize the array.
        collection_size_type rows(t.rows());
        ar << BOOST_SERIALIZATION_NVP(rows);
        collection_size_type cols(t.cols());
        ar << BOOST_SERIALIZATION_NVP(cols);

        if (rows*cols) // save only if it is not empty
            ar << make_array(t.data(), t.size());
    }

    template<class Archive>
    inline void load(Archive & ar, ::my_Matrix & t,
                     const unsigned int /* file_version */)
    {
        collection_size_type rows;
        ar >> BOOST_SERIALIZATION_NVP(rows);
        collection_size_type cols;
        ar >> BOOST_SERIALIZATION_NVP(cols);

        // Note: if the saved array was empty, t is left unchanged.
        if (rows*cols) // read only if it is not empty
        {
            // Blitz++ arrays need to be resized
            t.resize(rows, cols);
            ar >> make_array(t.data(), t.size());
        }
    }

    // Dispatch to save() or load() depending on the archive type.
    template<class Archive>
    inline void serialize(Archive & ar, ::my_Matrix & t,
                          const unsigned int file_version)
    {
        split_free(ar, t, file_version);
    }

    } // namespace serialization
    } // namespace boost
    #endif

Ivan Gonzalez wrote:
Thanks very much both Sohail and Matthias for your reply!. [snip reply]
You're welcome. And thank you for understanding how open source works :-)
--
Sohail Somani
http://uint32t.blogspot.com
participants (3): Ivan Gonzalez, Matthias Troyer, Sohail Somani