From: Lubomir Bourdev (lbourdev_at_[hidden])
Date: 2006-10-31 20:47:01


 
Stefan Heinzmann wrote:
>
> Maybe it's just me but I find extending GIL to support
> something like the v210 Quicktime format quite challenging (I
> don't want to imply that this is GIL's fault). This is a
> 10-bit YUV 4:2:2 format which stores 6 pixels in 16 bytes. It
> appears to me as if trying to support it would touch on a lot
> of concepts and corners of GIL, as it would require a new
> pixel storage format, color space, component subsampling, and
> maybe more.
>
> I believe it would help understanding if you could try to
> give at least a road map of what needs doing to support this
> properly (a fully coded example would probably require quite
> some effort).
>

Stefan,

This is an excellent example of a very complicated image format.
Here is a link I found that describes it:

http://developer.apple.com/quicktime/icefloe/dispatch019.html#v210

Basically, each 16 bytes contain 6 packed Y'CbCr pixels, each channel of
which is 10 bits long. Because the format is 4:2:2, the Cb and Cr
channels are shared between neighboring pixels.
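
The 6-pixels-per-16-bytes packing determines how much storage a row needs and where a given pixel lives. A minimal sketch of that arithmetic follows; the helper names are hypothetical and not part of GIL, and any extra row alignment that a particular file or API may require is deliberately ignored:

```cpp
#include <cassert>
#include <cstddef>

// Constants taken from the description above: 6 pixels per 16-byte block.
const std::size_t kPixelsPerBlock = 6;
const std::size_t kBytesPerBlock  = 16;

// Number of 16-byte blocks needed to hold `width` pixels (rounded up).
inline std::size_t v210_blocks_per_row(std::size_t width) {
    return (width + kPixelsPerBlock - 1) / kPixelsPerBlock;
}

// Minimum bytes for one row, with no additional row padding assumed.
inline std::size_t v210_min_row_bytes(std::size_t width) {
    return v210_blocks_per_row(width) * kBytesPerBlock;
}

// Which block of the row, and which slot (0..5) inside it, holds pixel x.
struct block_pos { std::size_t block; std::size_t index; };

inline block_pos v210_locate(std::size_t x, std::size_t width) {
    assert(x < width);
    block_pos pos = { x / kPixelsPerBlock, x % kPixelsPerBlock };
    return pos;
}
```

For a 1920-pixel row this gives 320 blocks, so at least 5120 bytes per row.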

Here is a rough plan of how I would approach modeling this in GIL:

1. Provide yCrCb color space
2. Provide a model of a sub-byte channel reference whose offset can be
specified at run time
3. Create a custom pixel iterator to handle the v210 format

__________________________
Detail:

1. Provide yCrCb color space (see design guide for detail):

struct ycrcb_t {
    typedef ycrcb_t base;
    BOOST_STATIC_CONSTANT(int, num_channels=3);
};

This defines the common typedefs for pixels, iterators, locators,
images, etc:

GIL_DEFINE_ALL_TYPEDEFS(8, ycrcb)
GIL_DEFINE_ALL_TYPEDEFS(8s, ycrcb)
GIL_DEFINE_ALL_TYPEDEFS(16, ycrcb)
GIL_DEFINE_ALL_TYPEDEFS(16s,ycrcb)
GIL_DEFINE_ALL_TYPEDEFS(32f,ycrcb)
GIL_DEFINE_ALL_TYPEDEFS(32s,ycrcb)

2. Create a model of a sub-byte channel reference, whose offset is a
dynamic parameter.
This is almost identical to class packed_channel_reference from the
packed_pixel example:

template <typename DataValue, typename ChannelValue,
          int FirstBit, int NumBits, bool Mutable>
class packed_channel_reference;

Except that FirstBit is passed at run time and stored inside of it:

template <typename DataValue, typename ChannelValue,
          int NumBits, bool Mutable>
class packed_runtime_channel_reference {
    ...
    const int _first_bit;
};
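
To make the idea of a run-time bit offset concrete, here is a minimal, self-contained sketch of such a proxy. It is not the GIL class: the name runtime_bits_ref is hypothetical, the data and channel types are fixed to uint32_t and uint16_t, and offsets are assumed to count from the least significant bit:

```cpp
#include <cassert>
#include <stdint.h>

// Hypothetical stand-in for packed_runtime_channel_reference: a proxy that
// reads and writes NumBits bits of a 32-bit word, starting at a bit offset
// chosen at run time (counted from the least significant bit).
template <int NumBits>
class runtime_bits_ref {
public:
    runtime_bits_ref(uint32_t& word, int first_bit)
        : _word(&word), _first_bit(first_bit) {
        assert(first_bit >= 0 && first_bit + NumBits <= 32);
    }

    // Read: shift the field down and mask off the neighboring channels.
    operator uint16_t() const {
        return static_cast<uint16_t>((*_word >> _first_bit) & mask());
    }

    // Write: clear the field, then OR in the new value. The other bits of
    // the word (the other channels) are left untouched.
    runtime_bits_ref& operator=(uint16_t v) {
        *_word = (*_word & ~(mask() << _first_bit))
               | ((static_cast<uint32_t>(v) & mask()) << _first_bit);
        return *this;
    }

    // Assigning one proxy to another copies the channel value, not the
    // pointer, so the proxy behaves like a reference.
    runtime_bits_ref& operator=(const runtime_bits_ref& other) {
        return *this = static_cast<uint16_t>(other);
    }

private:
    static uint32_t mask() { return (uint32_t(1) << NumBits) - 1; }

    uint32_t* _word;
    int _first_bit;
};
```

Storing the offset in the object costs an extra int per reference compared with the compile-time FirstBit version, which is the price of choosing the position per pixel.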

We now have a model of the 10-bit channel:

typedef packed_runtime_channel_reference<uint32_t, uint16_t, 10, true>
    v120_channel_ref_t;

We can use it to define a model of a pixel reference. We can reuse
planar_ref, which is a class that models PixelConcept whose channels are
at disjoint places in memory:

typedef planar_ref<v120_channel_ref_t, ycrcb_t> v120_pixel_ref_t;

3. Create a custom pixel iterator, containing a pointer to the first
byte of a 16-byte block and the index of the current pixel in the block:

// Models PixelIteratorConcept
struct v120_pixel_ptr : public boost::iterator_facade<...> {
    uint32_t* p; // pointer to the first byte of a 16-byte chunk
    int index; // which pixel is it currently on? (0..5)

    typedef v120_pixel_ref_t reference;
    typedef ycrcb16_pixel_t value_type;

    void increment();
    reference dereference() const;
};

Its increment will bump up the index of the pixel, and if it reaches 6,
will move the pointer to the next 16 bytes:

void v120_pixel_ptr::increment() {
   if (++index==6) {
      index=0;
      p+=4;
   }
}
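
The same pointer-plus-index bookkeeping extends to stepping backwards and to jumping several pixels at once, which a random-access iterator would need. A minimal sketch, using a hypothetical block_cursor struct rather than the real iterator_facade-based class:

```cpp
#include <cassert>
#include <cstddef>
#include <stdint.h>

// Hypothetical cursor mirroring the p/index pair described above.
struct block_cursor {
    const uint32_t* p;  // first 32-bit word of the current 16-byte block
    int index;          // pixel within the block, 0..5

    // Step forward one pixel, as in increment() above.
    void increment() {
        if (++index == 6) { index = 0; p += 4; }
    }

    // Step back one pixel; the mirror image of increment().
    void decrement() {
        if (index-- == 0) { index = 5; p -= 4; }
    }

    // Jump n pixels in either direction in constant time.
    void advance(std::ptrdiff_t n) {
        assert(index >= 0 && index < 6);
        std::ptrdiff_t pos = index + n;
        // Floor division, so negative positions land in an earlier block.
        std::ptrdiff_t blocks = pos >= 0 ? pos / 6 : -((-pos + 5) / 6);
        p += blocks * 4;
        index = static_cast<int>(pos - blocks * 6);
    }
};
```

The floor division matters for negative offsets: plain integer division truncates toward zero and would leave index negative.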

Its dereference will return a reference to the appropriate channels.
For example, the fourth pixel (index 3) uses:
 For Y': bits [22..31] of the 3rd word
 For Cb: bits [2 ..11] of the 2nd word
 For Cr: bits [12..21] of the 3rd word

v120_pixel_ptr::reference v120_pixel_ptr::dereference() const {
   switch (index) {
      ...
      case 3: return reference(               // fourth pixel
              v120_channel_ref_t(*(p+2),22),  // Y': 3rd word
              v120_channel_ref_t(*(p+1),2),   // Cb: 2nd word
              v120_channel_ref_t(*(p+2),12)); // Cr: 3rd word
      ...
   }
}
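
An alternative to a hand-written switch is a small lookup table from pixel index to the word and shift of each channel. The sketch below is read-only and assumes the commonly documented v210 layout: little-endian 32-bit words, each holding three 10-bit components in bits 0..9, 10..19 and 20..29. Its shifts count from the least significant bit, so they need not line up with the bit ranges quoted above; treat the table as an assumption and check it against the format description. All names here are hypothetical:

```cpp
#include <cassert>
#include <stdint.h>

// Where one 10-bit channel lives: which of the block's four 32-bit words,
// and how far to shift right (counting from the least significant bit).
struct channel_pos { int word; int shift; };
struct pixel_layout { channel_pos y, cb, cr; };

// ASSUMED layout (verify against the format specification). From low bits
// to high bits:
//   word 0: Cb0 Y0 Cr0    word 1: Y1 Cb1 Y2
//   word 2: Cr1 Y3 Cb2    word 3: Y4 Cr2 Y5
// Pixels 2k and 2k+1 share the chroma pair (Cbk, Crk).
static const pixel_layout kLayout[6] = {
    { {0, 10}, {0,  0}, {0, 20} },  // pixel 0
    { {1,  0}, {0,  0}, {0, 20} },  // pixel 1
    { {1, 20}, {1, 10}, {2,  0} },  // pixel 2
    { {2, 10}, {1, 10}, {2,  0} },  // pixel 3
    { {3,  0}, {2, 20}, {3, 10} },  // pixel 4
    { {3, 20}, {2, 20}, {3, 10} },  // pixel 5
};

struct ycbcr10 { uint16_t y, cb, cr; };

// Pull one 10-bit field out of the block.
inline uint16_t extract(const uint32_t* block, channel_pos c) {
    return static_cast<uint16_t>((block[c.word] >> c.shift) & 0x3FFu);
}

// Read pixel `index` (0..5) from a block already loaded as native words.
inline ycbcr10 read_pixel(const uint32_t* block, int index) {
    assert(index >= 0 && index < 6);
    const pixel_layout& l = kLayout[index];
    ycbcr10 px = { extract(block, l.y), extract(block, l.cb),
                   extract(block, l.cr) };
    return px;
}
```

A table keeps the layout in one place, which makes it easier to correct if the assumed bit positions turn out to be wrong.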

You can now construct a view from the iterator:

typedef type_from_x_iterator<v120_pixel_ptr>::view_t v120_view_t;

And you should be able to construct it with common GIL functions:

v120_view_t v120_view=interleaved_view(width, height, ptr, row_bytes);

You should be able to use this view in algorithms:

copy_pixels(v120_view1, v120_view2);

Note that it is only compatible with other v210 views, so you cannot
copy to or from a regular view, even one of Y'CbCr type. To do that you
will have to write channel conversion and color conversion; see the
packed_pixel.hpp example for how to do that.
Once you have done that, you should be able to do:

copy_and_convert_pixels(v120_view, rgb8_view);
copy_and_convert_pixels(rgb8_view, v120_view);

or:

jpeg_write_view("out.jpg",
    color_converted_view<rgb8_pixel_t>(v120_view, v120_color_converter));
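
As a rough illustration of what the channel and color conversion involve, here is a standalone sketch that turns one 10-bit Y'CbCr triple into 8-bit RGB. It does not use GIL's conversion hooks, all names are hypothetical, and the coefficients are an assumption (BT.601 video range, Y' in 16..235 after scaling to 8 bits); HD material would normally call for BT.709 coefficients instead:

```cpp
#include <cassert>
#include <stdint.h>

struct rgb8 { uint8_t r, g, b; };

// Channel conversion: drop the two low bits of a 10-bit value.
inline int to_8bit(uint16_t v10) {
    assert(v10 < 1024);
    return v10 >> 2;
}

// Round to nearest and clamp to [0, 255].
inline uint8_t clamp8(double v) {
    if (v < 0.0) return 0;
    if (v > 255.0) return 255;
    return static_cast<uint8_t>(v + 0.5);
}

// Color conversion with ASSUMED BT.601 video-range coefficients.
inline rgb8 ycbcr10_to_rgb8(uint16_t y10, uint16_t cb10, uint16_t cr10) {
    const double y  = 1.164 * (to_8bit(y10) - 16);
    const double cb = to_8bit(cb10) - 128;
    const double cr = to_8bit(cr10) - 128;
    rgb8 out = {
        clamp8(y + 1.596 * cr),
        clamp8(y - 0.392 * cb - 0.813 * cr),
        clamp8(y + 2.017 * cb)
    };
    return out;
}
```

With these coefficients, (64, 512, 512) maps to black and (940, 512, 512) maps to white.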

You should also be able to run carefully designed generic algorithms
directly on native v210 data.

Lubomir

