Boost logo

Boost Users :

Subject: Re: [Boost-users] container with large number of elements and a small number of invalide elements
From: Jeff Flinn (Jeffrey.Flinn_at_[hidden])
Date: 2013-11-01 15:18:11


On 11/1/2013 1:28 PM, MM wrote:
> On 1 November 2013 12:37, Jeff Flinn <Jeffrey.Flinn_at_[hidden]
> <mailto:Jeffrey.Flinn_at_[hidden]>> wrote:
> On 10/31/2013 1:14 PM, MM wrote:
> On 31 October 2013 16:10, Klaim - Joël Lamotte
> <mjklaim_at_[hidden] <mailto:mjklaim_at_[hidden]>
> <mailto:mjklaim_at_[hidden] <mailto:mjklaim_at_[hidden]>>> wrote:
>
> What would "valid"/"invalid" means here?
> If it's "exist" or "don't exist", then maybe using
> boost::optional<MyPod> would be enough.
>
> That's right. How about sizeof( boost::optional<MyPod> ) vs sizeof(
> MyPOD ) ?
> Also, what sort of iterator can iterate over just the valid
> values? or
> how to implement one?
> The point here is that there's a small number of elements that are
> invalid, their indices in the container are random,
>
> So why not keep the invalid items from getting into the container in
> the first place?
>
> Jeff
>
> because they are processed at a later stage, their indices in the
> container are relevant,

I'd need a better description of just what defines and item to be
valid/invalid. And just what this processed at a later stage means.

If you've some functions:

   bool valid(const MyPod& ) { return ...; }
   bool invalid(const MyPod& pod) { return !valid(pod); }

, and a pre-existing container then you can get an adapted view using
boost::range::adaptors::filtered like:

typedef std::vector<MyPod> MyPods;

MyPods all(...);
MyPods bad;

boost::copy(all | filtered(invalid), std::back_inserter(bad));

boost::foreach(all | filtered(valid), someFunction);

It all depends on the context of where this is used. Another approach
uses the boost icl (interval container library). One example is:

http://www.boost.org/doc/libs/1_55_0b1/libs/icl/doc/html/boost_icl/projects.html#boost_icl.projects.large_bitset

I've used a similar approach to just store the "valid" bases in a dna
sequences and keeping track of gaps with an interval container. The
advantage is avoiding the filtering cost when the application is
primarily interested in the valid items only.

Jeff


Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net