
Boost Users :

Subject: Re: [Boost-users] [serialization] how can load the nth element using binary_archive?
From: François Mauger (mauger_at_[hidden])
Date: 2011-10-23 13:13:15


On 23/10/2011 18:24, Robert Ramey wrote:
> gongyiling wrote:
>> Hello, Everyone:
>>
>> boost::serialization is a cool library. I use it in my project, and I
>> have run into a problem.
>> I serialized an array of objects to a file as follows:
>>
>> std::ofstream ofs(filename, std::ios_base::binary);
>> boost::archive::binary_oarchive oar(ofs);
>> oar<< obj[0]<< obj[1]<< ...<< obj[n-1];
>>
>> Now I want to load obj[i] (i>=0 && i<n) from the file. I tried
>> recording obj[i]'s
>> position in the file, then setting the file stream's pointer to that
>> position and loading it, but that
>> failed with the exception input_stream_error. Is there any way to do
>> this?
>> Here is my minimal code segment:
>>
>> struct word_block
>> {
>> std::string word;
>> template<typename Archive>
>> void serialize(Archive& ar, const unsigned int file_version )
>> {
>> ar& word;
>> }
>> };
>>
>> int main(int argc, char* argv[])
>> {
>> const char* filename = "words.idx";
>> word_block wb1, wb2, wb2_r;
>>
>> std::ofstream ofs(filename, std::ios_base::binary);
>> boost::archive::binary_oarchive oar(ofs);
>> oar<< wb1;
>> size_t wb2_pos = ofs.tellp(); //record the position of wb2.
>> oar<< wb2;
>> ofs.close();
>>
>> std::ifstream ifs(filename, std::ios_base::binary);
>> boost::archive::binary_iarchive iar(ifs);
>> ifs.seekg(wb2_pos); //set file pointer to wb2.
>> iar>> wb2_r; //failed with exception input_stream_error.
>> return 0;
>> }
>>
>> Any idea is appreciated!
>
> I can't see a way to do this without making non-trivial changes
> in the serialization library
>

Hi

The Boost/serialization library outputs 'sequential access' archives.
However, this is not a blocking point.
To get the feature you want, you can use a third-party I/O
library such as HDF5 (or ROOT from CERN) that provides random access to
arbitrary data structures (typically buffers of chars) stored in data files.

A possible solution could be (in fact *is*):

1 - associate a Boost output archive with a memory buffer (typically an
STL vector<char>) through a special 'ostream' (found in Boost/iostreams)
2 - serialize a single object from your array into the archive,
    so that there is *one* archive *per* object
    and *not* one archive for the whole array of objects
3 - store the corresponding buffer (size + content) using the native
output operations of the external I/O lib, together with a mapping key
(index, label...) that will be used later for random access to the buffer.

Deserialization can then be performed as follows:

1 - use a random access input operation from the external lib to
rebuild *the* single memory buffer associated with *the* object you want
to retrieve (by index, by label... depending on the internal "mapping"
of the I/O lib)
2 - associate a Boost input archive with this memory buffer,
    which contains the bytes of the Boost archive for only one object
(again Boost/Iostreams is at work)
3 - finally, use the standard Boost/Serialization method to get your object
from the archive.

Needless to say, it takes some time to implement and make it work
(Boost/iostreams is of great help here), and you need to dig into the guts
of another I/O library.
But it works for me with a home-made library that couples Boost with
such a high-level third-party I/O library (the ROOT I/O lib), and it is
quite efficient.
You thus get the best of the Boost/Serialization library (easy to
write serialization code for all your classes in a unified approach)
and of a robust I/O lib (buffered I/O ops, automated storage from/into
split files, random access...).

Not sure these comments/ideas will help you today... but if you feel
brave... ;-)

cheers

frc

-- 
François Mauger
