Subject: Re: [boost] [gil io_new review] Reading images from in-memory sources
From: Phil Endecott (spam_from_boost_dev_at_[hidden])
Date: 2010-12-06 10:38:15

Christian Henning wrote:
>> Can anyone help me to implement this:
>> void read_mem_jpeg( some_gil_image_type& dest, const char* jpeg_mem_begin,
>> const char* jpeg_mem_end) { ... }
>> in a way that doesn't copy all of the encoded data?
> Please correct me if I'm wrong, but what you're looking for is a way to
> create a stringstream or similar that can be initialized with a
> character array containing an image?

..without copying all of the data, yes.

> std::stringstream allows you to do that. Like this:
> template< typename Image >
> void read_mem_jpeg( Image& dest
> , const char* jpeg_file
> , const char* jpeg_file_end
> )
> {
> std::istringstream ss( std::string( jpeg_file, jpeg_file_end ), std::ios::binary );
> read_image( ss, dest, jpeg_tag() );
> }
> Now, I don't know if the constructor of stringstream copies the data
> into another buffer. I'm not really an expert with c++ streams.

The string ctor copies all the data, and then the istringstream ctor
may or may not copy it all again (does anyone know?).
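For reference, one way to avoid both copies is a small stream buffer whose get area points straight at the caller's memory. The following is only a minimal sketch: mem_streambuf is a hypothetical name, not part of GIL or io_new, and it assumes read_image() will accept any std::istream, as the snippet above suggests.

```cpp
#include <cassert>
#include <istream>
#include <streambuf>

// Hypothetical helper (not part of GIL or io_new): a read-only stream buffer
// whose get area is the caller's memory range itself, so nothing is copied.
class mem_streambuf : public std::streambuf {
public:
    mem_streambuf(const char* begin, const char* end) {
        assert(begin <= end);
        // setg() takes non-const pointers. No output functions are
        // overridden here, so the stream never writes through them.
        char* b = const_cast<char*>(begin);
        setg(b, b, const_cast<char*>(end));
    }
};
```

Usage would be along the lines of `mem_streambuf sb(jpeg_mem_begin, jpeg_mem_end); std::istream in(&sb);` and then passing `in` to the reader. Note that this sketch does not override seekoff()/seekpos(), so tellg() and seekg() will fail; a reader that seeks would need those added. Boost.Iostreams' array_source provides something similar ready-made.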

>> Looking at formats/jpeg/read.hpp, it seems that irrespective of that the
>> data will be copied again (into 'buffer') before being read by libjpeg.
> The 'buffer' only contains one scanline ( or row ) of your image. When
> reading an image with this extension it is usually done in a
> scanline-by-scanline manner.

Right. The data is all copied again.

(There are two issues with copying. The first is the extra memory
needed. In cases like this where you're copying line-by-line that's
not much of a concern. The second is the extra time taken. This is a
concern even when done line-by-line.)

>> On the subject of copying, while reading PNGs it looks like the data is
>> always copied into a temporary buffer in read_rows<>() even when no format
>> conversion is needed (or is there some specialisation that I have missed?).
>> And when reading JPEGs, it seems to always read single lines, yet libjpeg
>> advises that you should get at least rec_outbuf_height lines on each call to
>> avoid extra copying within the library.
> Each format has its own reader class. One of the template parameters
> is the ConversionPolicy which can be either read_and_no_convert or
> read_and_convert. Each of these two classes has a member "read" which
> in the case of read_and_no_convert calls std::copy()

Right. It would be great to eliminate that copy.

> and for read_and_convert calls std::transform().
> When reading scanline by scanline the io extension has a buffer which
> holds one scanline only. The buffer is fed into the underlying library
> if there is one. In case no conversion is necessary I could
> potentially have used the user-provided image as the buffer.
> About libjpeg, thanks for pointing out one potential inefficiency. As
> far as I can tell rec_outbuf_height is set to 1 by default. A user can
> change that number to 2 or 4 but the lib would need to be recompiled.
> The user also needs to set the UPSAMPLE_MERGING_SUPPORTED compiler
> flag.

Well, here's what it says in jpeglib.h:

   /* Description of actual output image that will be returned to application.
    * These fields are computed by jpeg_start_decompress().
    * You can also use jpeg_calc_output_dimensions() to determine these values
    * in advance of calling jpeg_start_decompress().
    */

   JDIMENSION output_width; /* scaled image width */
   JDIMENSION output_height; /* scaled image height */
   int out_color_components; /* # of color components in out_color_space */
   int output_components; /* # of color components returned */
   /* output_components is 1 (a colormap index) when quantizing colors;
    * otherwise it equals out_color_components.
    */
   int rec_outbuf_height; /* min recommended height of scanline buffer */
   /* If the buffer passed to jpeg_read_scanlines() is less than this many rows
    * high, space and time will be wasted due to unnecessary data copying.
    * Usually rec_outbuf_height will be 1 or 2, at most 4.
    */

To me, that looks like the library sets the value, not the user.
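To make the suggestion concrete, here is a rough sketch of reading rec_outbuf_height rows per call straight into the destination image, with no intermediate row buffer. It is only an illustration: read_in_batches is a hypothetical helper, and the read_rows callable merely stands in for jpeg_read_scanlines( &cinfo, rows, max_rows ), which fills up to max_rows rows and returns how many it actually wrote. It also assumes that no format conversion is needed and that the destination rows sit row_bytes apart.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not GIL or libjpeg API. read_rows(rows, max_rows) is a
// stand-in for jpeg_read_scanlines(&cinfo, rows, max_rows): it fills up to
// max_rows rows and returns the number of rows actually written.
template <typename ReadRows>
std::size_t read_in_batches(unsigned char* dest, std::size_t row_bytes,
                            std::size_t height, std::size_t rec_height,
                            ReadRows read_rows)
{
    assert(rec_height > 0);
    std::vector<unsigned char*> rows(rec_height);
    std::size_t done = 0;
    while (done < height) {
        // Ask for a full batch, or for whatever is left at the bottom.
        const std::size_t want = std::min(rec_height, height - done);
        // Point each row directly into the destination image.
        for (std::size_t i = 0; i < want; ++i)
            rows[i] = dest + (done + i) * row_bytes;
        const std::size_t got = read_rows(rows.data(), want);
        if (got == 0) break;  // no progress, e.g. truncated input
        done += got;
    }
    return done;  // number of rows decoded
}
```

With libjpeg, rec_height would be cinfo.rec_outbuf_height as computed by jpeg_start_decompress(). The loop tolerates the decoder returning fewer rows than requested, since the row pointers are recomputed from the running count on each pass.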

Regards, Phil.
