Yes yes yes, is it on yahoo ?


At 18:56 7/10/2003 -0400, you wrote:

Is there any interest in a class implementing compression algorithm for transition sparse bitsets. As an example of the algorithm's utility let's say we want to create a uniform minute by minute record of weekly attendance applicable to all of the existing web users worldwide. The user's record thus conceived must cover all, let's say 10,000,000 of the existing web sites and will be 60 * 24 * 7 * 10,000,000 or 100,800,000,000 bits long. A simple observation that makes this proposition feasible, though, is that first, rather than storing a given user's bitset attendance record, what is sufficient for us to store instead is the user's logon and logoff times for the websites the user attended which in turn are represented by the bitset's transitions and second the number of transitions in the bitset is subject to the user's physical ability to attend only that many Web sites and no more. Thus the bitset's transition count, which in the example at hand may run in hundreds at most, is small relative to the bitset's mega size. The fast proliferating sources of information, be it the Internet or in any other mass medium, combined with the user's ability to absorb but a tiny part of the information available plus the necessity to research and monitor global flow of information, conditions that are not likely to change, makes transition sparse bitsets arguably the most important and ubiquitous bitsets in the existence. The ts_bitset class stores a bitset of variable length as a std::vector of the bitset transitions, implements the relevant part of the std::bitset functionality and extends it to the subintervals of the bitset. For example std::bitset::count() function has ts_bitset::count(pos1,pos2) counterpart which returns in O(n) time, n being the bitset's transition count, and std::bitset::any() function has a ts_bitset::any(pos1, pos2) counterpart which returns in Log(n) time. The ts_bitset class also implements operators &=, |=, and ^= that act on two equal size bitsets and which, along with ts_bitset member functions like reverse () or flip (), were not speed wise feasible for std::bitset implementation. All of the ts_bitset member functions return in O(n) time (O(m + n) time when 2 bitsets are involved) or faster.

-----------------------------------------------------------------------------------
Jonathan de Halleux, Research Assistant
Center for Systems Engineering and Applied Mechanics (CESAME)
Universite catholique de Louvain
Batiment Euler , Av. Georges Lemaitre, 4 Tel : +32-10-47 2595
B-1348 Louvain-la-Neuve Belgium
E-mail : dehalleux@auto.ucl.ac.be
-----------------------------------------------------------------------------------