Boost Users :


Subject: Re: [Boost-users] container with large number of elements and a small number of invalide elements
From: Jeff Flinn (Jeffrey.Flinn_at_[hidden])
Date: 2013-11-05 13:30:05

Next message: Vincent N. Virgilio: "Re: [Boost-users] container with large number of elements and a small number of invalide elements"
Previous message: MM: "Re: [Boost-users] container with large number of elements and a small number of invalide elements"
In reply to: MM: "Re: [Boost-users] container with large number of elements and a small number of invalide elements"

On 11/5/2013 12:43 PM, MM wrote:
> http://www.boost.org/doc/libs/1_55_0b1/libs/icl/doc/html/boost_icl/projects.html#boost_icl.projects.large_bitset
>
> I've used a similar approach to store just the "valid" bases in DNA
> sequences, keeping track of gaps with an interval container. The
> advantage is avoiding the filtering cost when the application is
> primarily interested in the valid items only.
>
> I would recommend checking out Google sparsehash -- specifically,
> sparse_hash_set<T>:
> http://code.google.com/p/sparsehash/source/browse/trunk/src/sparsehash/sparse_hash_set
>
> The biggest difference between that and something like a
> std::vector<boost::optional<T>> is that google::sparse_hash_set will
> re-order values on you (the nature of hash tables). It also forces you
> to have a "deleted key" which exists in the realm of possible T values.
> Based on the impression that I get from your original email, this fits
> with your idea of an "invalid" T. That said, it is a ridiculously
> low-overhead container (on the order of ~2 bits/entry vs 4 bytes/entry
> with a boost::optional).
>
> That's 2x10^6 bits for the OP's case, whereas the ICL approach is
> 64 bits per interval. In the OP's case that would be 64x100 (the max
> number of invalid items, assuming all invalid items are separated by at
> least 1 valid item), about 3 orders of magnitude less.
>
> Jeff
>
> Thanks for the various answers. I'm reading about ICL and sparsehash
> to learn more.
>
> Basically,
>
>     struct MyPOD { double d1; ..... double d8; }
>
> and I currently have a std::vector<MyPOD> with 1e6 elements. 100 of
> those would be == a particular instance of MyPOD,
> invalid { +Inf, +Inf, ... -Inf, 0. }. It's a known, predefined MyPOD
> constant.
>
> I need to:
> 1. Calculate the max element and the min element _while_ ignoring the
>    100 or so invalid entries (I don't know where they are). I can't,
>    however, change the vector<MyPOD>; it's read-only.
> 2. Plot it (i.e., plot all the elements that are not invalid). The
>    index of each element in the vector is transformed and returns a
>    sort of coordinate to plot a point.
>
> So, I would have wrapped this vector in a class and exposed an
> interface similar to vector's, but would have implemented an iterator
> that just skips the invalid entries: when incremented or decremented,
> the iterator would skip invalid entries until the next valid one.
> My class would also be indexable with [], with this access not in the
> same order as vector's [] (it would skip the invalid entries, so it
> would be slower).
>
> Perhaps the best choice is a boost range adapted view.
>
> Reading the docs.... thanks very much
>
> I have wrapped my vector in a class, and used the boost range adaptor
> filtered.
>
> I've implemented operator[](std::size_t i) as a simple
> std::advance(begin, i), where begin is the begin iterator to a filtered
> view, myvector | boost::adaptors::filtered( ... ), with a predicate
> that ignores my invalid values. This advance on the view gives the
> performance of a forward-only iterator (not a random access iterator).
>
> This appears to be too slow in my case; I need access that is an order
> of magnitude faster.
>
> I probably should have a solution where I store the indices of the
> invalid values, and implement operator[](std::size_t i) with the help
> of those indices.
>
> MM

Or modify your algorithm to not require random access. ;-)

What are you wanting to do with the sequence(s)?

Jeff
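
For reference, the idea floated in the quoted message (storing the indices of the invalid values and implementing operator[] with their help) could be sketched roughly as below. This is only an illustration under assumptions, not code from the thread: the class name SkipInvalidView, the member valid_before_ and the helper raw_index are made up here, T is assumed to be comparable with ==, and the underlying vector is assumed to outlive the view and stay unmodified. With k invalid entries, construction is one linear scan and each lookup is a binary search over k values.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical read-only view over a vector that skips every element equal
// to a known sentinel. All names here are illustrative assumptions.
template <class T>
class SkipInvalidView {
public:
    // The view keeps a pointer to `data`; the vector must outlive the view
    // and must not be modified while the view is in use.
    SkipInvalidView(const std::vector<T>& data, const T& invalid)
        : data_(&data) {
        // One linear pass: for the j-th invalid element found at raw index p,
        // record p - j, i.e. the number of valid elements that precede it.
        // The recorded values are non-decreasing, so they can be searched.
        std::size_t found = 0;
        for (std::size_t p = 0; p < data.size(); ++p) {
            if (data[p] == invalid) {
                valid_before_.push_back(p - found);
                ++found;
            }
        }
    }

    // Number of valid (non-sentinel) elements.
    std::size_t size() const { return data_->size() - valid_before_.size(); }

    // Map an index among the valid elements to an index in the raw vector.
    // Every invalid element preceded by at most i valid ones lies before
    // the i-th valid element, so the raw index is i plus their count.
    std::size_t raw_index(std::size_t i) const {
        assert(i < size());
        auto it = std::upper_bound(valid_before_.begin(),
                                   valid_before_.end(), i);
        return i + static_cast<std::size_t>(it - valid_before_.begin());
    }

    // Random access among valid elements: O(log k) for k invalid entries.
    const T& operator[](std::size_t i) const {
        return (*data_)[raw_index(i)];
    }

private:
    const std::vector<T>* data_;
    std::vector<std::size_t> valid_before_;
};
```

With MyPOD this would need an operator== (or an equivalent predicate) that recognises the predefined invalid constant. For roughly 100 invalid entries out of 1e6, each lookup is then a binary search over about 100 offsets rather than a walk from the start of the filtered view.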