Boost Users :

From: Jonathan Turkanis (turkanis_at_[hidden])
Date: 2008-02-26 22:38:18


Hi Mengda,

I'm sorry I didn't see this post sooner. If you include the library name
in your message subject it is more likely that the library author will
respond quickly.

Mengda Wu wrote:
> Hi,
>
> I have a code block trying to output a series of gzip files. I wish
> to have all the files flushed after the block.
> But the files can only be flushed after exiting the whole program.
> The files still have zero size after I try to close them. I am using
> std::vector to store all the pointers to boost filtering_ostream.
> Can you help?

What you are noticing is that data is not being written to disk until
the streams are closed. This is not actually a bug, as I will explain,
but it still may warrant a change to the library.

flush() is just a suggestion; in general you can't force a filter to
output all the filtered data it is currently storing, except at the end
of the stream. There may be internal constraints, depending on the
format of the data output by the filter, that dictate when new data is
available for flushing. For example, an encryption filter might not be
able to output any new characters until its input has length equal to a
multiple of its block size, or until EOF occurs.
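To make the block-size point concrete, here is a toy sketch using only the standard library. block_filter is a hypothetical class invented for illustration; it is not part of Boost.Iostreams, and the XOR step is only a stand-in for real encryption. The point is that flush() can hand over complete blocks only, so a short write produces nothing until close().

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical toy filter (not a Boost.Iostreams class): it releases output
// only in whole blocks, the way a block cipher would.
class block_filter {
public:
    explicit block_filter(std::size_t block_size) : block_size_(block_size)
    {
        assert(block_size > 0);
    }

    // Accept more input; it is only buffered here.
    void write(const std::string& data) { pending_ += data; }

    // "Flush": return every complete block and keep the partial one buffered.
    std::string flush()
    {
        const std::size_t ready = pending_.size() / block_size_ * block_size_;
        std::string out = transform(pending_.substr(0, ready));
        pending_.erase(0, ready);
        return out;
    }

    // End of stream: pad the last partial block with zero bytes and emit it.
    std::string close()
    {
        if (!pending_.empty()) {
            const std::size_t padded =
                (pending_.size() + block_size_ - 1) / block_size_ * block_size_;
            pending_.resize(padded, '\0');
        }
        std::string out = transform(pending_);
        pending_.clear();
        return out;
    }

    // Number of characters accepted but not yet emitted.
    std::size_t buffered() const { return pending_.size(); }

private:
    // Stand-in for real encryption: XOR every byte with a fixed value.
    static std::string transform(std::string data)
    {
        for (std::size_t i = 0; i < data.size(); ++i)
            data[i] = static_cast<char>(data[i] ^ 0x5A);
        return data;
    }

    std::size_t block_size_;
    std::string pending_;
};
```

With a block size of 8, writing 5 characters and then calling flush() returns an empty string; the 5 characters only come out (padded to 8) when close() is called.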

In the case of the gzip filters, Boost.Iostreams simply lets zlib
determine when new characters in the filtered sequence are available.
In your example, the text to be compressed is very short (21 characters)
and it looks like zlib is simply waiting for more input before it spits
anything out. When I run your example with 250K of uncompressed data,
there is output written to disk before the streams are closed.

I have opened a ticket (http://svn.boost.org/trac/boost/ticket/1656)
raising the question of whether symmetric filters (including gzip) should
attempt to force the underlying filtering algorithm to spit out as many
bytes as possible when flush() is called.

> //Open iostreams
>
> char filename[20];
> std::vector<boost::iostreams::filtering_ostream *> os_vector;
> for(i=0; i<4; i++)
> {
> sprintf(filename, "file_%d.gz", i );
> std::ofstream *of = new std::ofstream(filename, std::ios_base::binary);
> boost::iostreams::filtering_ostream* os = new
> boost::iostreams::filtering_ostream;
> os->push(boost::iostreams::gzip_compressor());
> os->push(*of);
> os_vector->push_back(os);
> }
>
> //Output something
> for(i=0; i<4; i++)
> {
> boost::iostreams::filtering_ostream* os = os_vector[i];
> os<<"Output something here" << std::endl;
> }
>
> //Close streams
> for(i=0; i<4; i++)
> {
> boost::iostreams::filtering_ostream* os = os_vector[i];
> os->strict_sync();
> os->pop();
> os->reset();
> }

There are several other problems with this code.

First, os_vector is not a pointer, so instead of os_vector->push_back(os)
you should use os_vector.push_back(os). Second, the dynamically allocated
ofstreams are leaked: when you add them to a filtering stream, it does
not take ownership of them; it merely stores a reference. Third, the
dynamically allocated filtering_ostreams are in danger of being leaked
if an exception is thrown by any of the code following the allocation;
you should consider some other method of storing the streams -- possibly
using a ptr_vector (http://tinyurl.com/3x4yor). Fourth, it is useless to
call pop() immediately before reset(): pop() removes the last element
in a chain, while reset() removes all the elements in a chain.

> Thanks,
> Mengda

Best Regards,

-- 
Jonathan Turkanis
CodeRage
http://www.coderage.com
