Boost Users :

Subject: [Boost-users] slow gunzip
From: Andy C (andy.coolware_at_[hidden])
Date: 2015-09-23 20:03:42


Hi,

I am trying to write a simple program that processes large gzipped files as
streams. However, reading turns out to be very slow: just counting lines
takes almost 10 times longer than a Java equivalent or
"gunzip -c largefile.gz | wc -l". Could the experts point out which wrong
assumption I am making? The source is included below.

Thanks in advance,
Andy

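#include <fstream>
#include <iostream>
#include <string>
#include <boost/iostreams/filtering_stream.hpp>
#include <boost/iostreams/filter/gzip.hpp>

int main(int argc, char** argv)
{
    if (argc < 2) {
        std::cerr << "usage: " << argv[0] << " <file.gz>\n";
        return 1;
    }

    // Buffer for the underlying file stream; declared first so it outlives `file`.
    char buf[10 * 4096];

    std::cout << argv[1] << '\n';
    long c = 0;

    // Install the buffer before opening: the effect of pubsetbuf() on an
    // already-open file is implementation-defined.
    std::ifstream file;
    file.rdbuf()->pubsetbuf(buf, sizeof buf);
    file.open(argv[1], std::ios_base::in | std::ios_base::binary);

    try {
        boost::iostreams::filtering_istream in;
        in.push(boost::iostreams::gzip_decompressor());
        in.push(file);
        // Count lines in the decompressed stream.
        for (std::string str; std::getline(in, str); )
            c++;
    }
    catch (const boost::iostreams::gzip_error& e) {
        std::cout << e.what() << '\n';
    }
    std::cout << c << '\n';
    return 0;
}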
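
To tell whether the time goes into the per-line std::getline calls or into the decompression itself, one option is to count newlines with block reads instead. The following is only a minimal sketch under stated assumptions: the names count_newlines and block_size are hypothetical, the 64 KiB default is an arbitrary choice rather than a tuned value, and it has not been measured against the program above. Like "wc -l", it counts '\n' bytes, so a final line without a trailing newline is not counted (std::getline would count it).

```cpp
#include <algorithm>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: counts '\n' bytes by reading fixed-size blocks
// from any std::istream instead of extracting one line at a time.
long count_newlines(std::istream& in, std::size_t block_size = 64 * 1024)
{
    std::vector<char> block(block_size);
    long count = 0;
    for (;;) {
        in.read(block.data(), static_cast<std::streamsize>(block.size()));
        const std::streamsize got = in.gcount();
        if (got <= 0)
            break;  // nothing more could be read (EOF or error)
        // A short final block is still counted before the loop ends.
        count += static_cast<long>(
            std::count(block.data(), block.data() + got, '\n'));
    }
    return count;
}

// Convenience overload for in-memory data, handy for quick checks.
long count_newlines(const std::string& data)
{
    std::istringstream in(data);
    return count_newlines(in);
}
```

Since boost::iostreams::filtering_istream derives from std::istream, the getline loop above could be swapped for "c = count_newlines(in);" to compare timings on the same file.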


