|
Boost : |
From: joel de guzman (isis-tech_at_[hidden])
Date: 2001-06-21 10:30:39
Hello,
More on sparse bit vectors...
[snip]
[Joel]
> > -> Macintosh style graphical regions. A vector of sparse bit
> > vectors can represent something like Regions in the Mac.
> > in fact the Mac-Region inspired this implementation. [ I
> > am aware that this is patented. I don't know if my
> > implementation infringes on this patent. But I am also
> > aware that this patent will expire soon (now)? ]
[Joel]
I've always used Mac-Regions as sets before. I sorely miss it
when doing cross-platform programming.
[Dave]
I think the Mac implementation uses a some kind of difference encoding from
row-to-row, so that, e.g., a gray halftone (checkerboard) is actually very
cheap to encode. If you're not doing that, you probably don't infringe.
[Joel]
I did some Web search:
<quotes>
The region patent probably expires sometime between 1998 and 2000, but
I have a feeling that Apple won't be too unhappy with you for using the
info to write Mac programs before then.
-- Gavriel State | Lead Developer - CorelDRAW! for Macintosh It is patented, I've read the patent. I don't recall the patent # (this was several years ago), but if you find it you can order a copy from the US Patent office. So it is not secret. Look in IM QuickDraw, the patent number should be there somewhere. The format is rle, but with a couple of strange twists. There are two cases: 1) Trivial case: rectangle represented by a size of 10 and the rectangle in the Region.rbox field. 2) Complex case, size > 10. The data is a series of shorts in the following format y x x x 7fff y x x x 7fff 7fff where y is the current scanline, x is the horizontal offset and 0x7fff is the end of line/ end of data marker. What's tricky here, (and the key to the patent), is that x does not simply invert the current line, but rather x inverts everything down and to the right of the point. This is working from memory so there may be a few differences in the details. Check a couple of samples against this (overlapping rectangles is a good one to try). -- Jerry Whitnell First Virtual Corp jerry_at_[hidden] </quotes> [Joel] I guess Dave is right. My implementation does not infringe on the patent. I'm unsure why they chose an end marker instead of a length prefix. How could they do per-row bsearch for hit testing with this format? Also, all data is placed in a contiguous memory block. Hit testing requires linear pass over the Y lines! Well at least Apple's regions are totally ADTs. The only visible part is the region bbox and memory size. Might they have changed the representation by now? [Greg] > There are other ways to do sparse bit vectors. My favorite > is a very compact binary tree representation that uses on > average two bits for every "on" bit and no bits for the > "off" bits. It's not patented so far as I know. If it is > I know of plenty of prior art. [Joel] There are pros and cons for both techniques. The range-run technique has very nice properties especially suitable for many real world applications. I think the reason is because it is very natural to construct sets from ranges. And this particular data structure representation excels when defining ranges. A set of all unicode Japanese characters (Kanji, Katakana and Hiragana) for example connot be trivially represented with the binary tree scheme that you mentioned. This will occupy lots of memory and consequently have a slower insertion and search time. I am not sure about Kanji*, but Katakana and Hiragana occupy a single contiguous range. I'd say, at the most, there are not more than 5 ranges in there (correct me if I'm wrong, I don't have the Unicode charts with me right now). (*Japanese Kanji and the Chinese characters overlap) OTOH, the binary tree representation excels when the data is randomly distributed such that there are lots of discontinuities. Would it be possible to have both and dynamically switch to an algorithm/data structure based on the resulting size/complexity? Hmmm. Cheers, Joel de Guzman PS> I am also very much interested with floating point range- runs. Here, a step is 1 epsilon unit instead of 1 for integers. This I guess is not possible (or at least portable) with b-tree representations. PPS> I wonder why I can't find data structures and algorithms like this in books? None of my algorithm books have anything close. Or am I just plain ignorant again? Perhaps~~~
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk