
Boost Users :

From: bert hubert (bert.hubert_at_[hidden])
Date: 2006-03-16 16:43:06


The problem of quadratically slowing erase from boost::multi_index has been
traced to a FreeBSD-specific allocator problem, which is as yet unsolved.

I'm trying to make this problem easy to reproduce but haven't managed to do
so yet.

Kind regards,

bert

On Mon, Mar 13, 2006 at 10:43:34PM +0100, JOAQUIN_LOPEZ_MUÑOZ wrote:
> ----- Original Message -----
> From: bert hubert <bert.hubert_at_[hidden]>
> Date: Monday, March 13, 2006 8:32 pm
> Subject: Re: [Boost-users] SOLVED! Re: multi_index 1.33.0, known
> problems with RHEL 3 compiler (gcc 3.2.3)?
>
> > On Mon, Mar 13, 2006 at 05:22:35PM +0100, Joaquín Mª López Muñoz
> > wrote:
> > > I'd check the following for anomalies:
> >
> > Joaquin, I'm busy tonight with other things so I'm only able to give
> > you some approximate answers, for which I apologise. I hope to be
> > able to give more definitive answers Wednesday.
>
> Of course, come back when you can.
>
> >
> > > 1. Could you (just for tests) replace index #0, which now is
> > > hashed, with an ordered_unique index and rerun? A quadratic
> > > behavior could be the result of a very bad hashing scheme: this is
> > > extremely unlikely for your current code, but who knows.
> >
> > The original container had ordered_unique as its first index and
> > showed a
> > similar (large) increase in pruning time when pruning more cache
> > entries. I
> > did not look for a power law or anything.
>
> OK, let's rule out hashing problems, then.
>
> >
> > > 2. Is the size of CacheEntry::d_records roughly the same between
> > > prunes? Could it be that the average number of records in
> CacheEntry
> > > elements is disproportionately larger on those prunes that take
> > > longer?
> >
> > The cache remains almost identical in size, around 50 million
> > entries. The cache represents DNS records expiring; each run expires
> > all DNS entries whose 'time to die' (ttd) is smaller than the
> > current time.
>
> I'm not referring here to d_entry.size(), but to the
> average x.d_records.size() for x in d_entry. How many
> records does a cache entry have on the average? Does this
> experience large variations?
>
> >
> > This holds, I think, a vital clue to the problem at hand. Records are
> > inserted with ever-increasing ttds, generally. There is a number of
> > common time-to-live numbers for DNS records, i.e., 5, 60, 300, 3600,
> > 7200, and 86400 seconds.
> >
> > This means that we have a number of 'striding' ttds being inserted:
> > now + 5, now + 60, now + 300 ... now + 86400. Might this trigger a
> > badly balanced tree?
>
> It shouldn't; otherwise it would be a bug in Boost.MultiIndex.
> Not that I categorically deny this possibility, but I tend
> to think the problem might lie elsewhere. If we're left with
> nothing else, we could arrange a test for balancedness
> easily enough; please tell me when you're ready
> to instrument that.
>
> >
> > On a related note, I wonder if chopping off the lower half of any
> > sort of ordered tree might trigger worst-case behaviour. Note that
> > CacheEntry::getTTD(), which forms one index, might be slow to call.
>
> The erase() part is not affected, as it invokes neither the
> comparison predicate nor the key extractor (getTTD in your case).
>
> >
> > > 3. What percentage of elements of d_cache are visited on each
> > > prune? I.e., what is the typical value of looked/d_cache.size()? The
> > > nonvisited elements shouldn't have any impact on phase /*2*/,
> > > but they have a (minimal) impact on those operations of /*1*/
> > > performing as O(n log n), such as replace(). Anyway, no amount
> > > of nonvisited elements can have such a huge impact on the figures.
> >
> > Typically 500 elements are visited in a 50 million element
> > structure, so 1e-5. Things get crazy once > 1000 elements are
> > visited.
>
> It doesn't seem that large variations in d_entry.size() account
> for the problem, then.
>
> >
> > Thanks Joaquin, I hope to provide more measurements soon.
> >
>
> This problem is intriguing me... especially the fact that
> you observe quadratic behavior in a linearly-decomposable
> (and hence O(n)-behaved) problem. I think the following
> easy test can shed some light. To state the facts: what you
> get is reasonably fast pruning when the pruning period
> (and thus the looked value) is low, with an explosion
> when the pruning period reaches a given threshold, around
> say looked=800, right? And this happens even (quote)
> "without inserting something in the container in between",
> right? Why don't you then try the following: when
> pruning time is above the threshold, make doPrune
> do the pruning in two stages (and time each
> separately):
>
> 1. up to now/2.
> 2. up to now.
>
> According to the stated facts, making doPrune run in two
> stages is about equivalent to making it in one fell
> swoop, which implies the summed time won't change
> much. On the other hand, it is also equivalent to
> having a halved, below-threshold pruning period, thus
> implying, contradictorily, that the summed time will be
> much less than with one-stage pruning. Why don't you run
> the test? Interesting things might come up...
>
> Joaquín M López Muñoz
> Telefónica, Investigación y Desarrollo
> _______________________________________________
> Boost-users mailing list
> Boost-users_at_[hidden]
> http://lists.boost.org/mailman/listinfo.cgi/boost-users

-- 
http://www.PowerDNS.com      Open source, database driven DNS Software 
http://netherlabs.nl              Open and Closed source services

Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net