Boost logo

Boost Users :

From: Joaquín Mª López Muñoz (joaquin_at_[hidden])
Date: 2006-03-13 11:22:35


bert hubert ha escrito:

> On Sun, Mar 12, 2006 at 04:52:34PM +0100, JOAQUIN_LOPEZ_MU?Z wrote:
>
> > So, the prunning code goes like:
> >
> > /* 1 */
> > for(j=ttdindex.begin();j!=ttdindex.end();){
> > // leave to-be-pruned elements at the beginning
> > // of ttdindex
> > }
> >
> > /* 2 */
> > ttdindex.erase(ttdindex.begin(), j);
>
> I've timed it separately, /* 1 */ is 'walk took', /* 2 */ is 'erase took'.
> It appears both are more or less related, but that /* 2 */ dominates the
> stats.
>
> The number of items deleted is looked - partial, where partial is the number
> of records that were partially pruned and begat a higher ttd because of
> the operation. So /* 1 */ is more than leaving elements behind, it also
> kicks some elements upwards.
>
> The list below gets interesting near the end.

[...]

Certainly, your figures show an unexpected rise in execution time
over linear when the number of looked up elements grow. I've
done some local tests with a similar data structure (one hashed index
plus one ordered index) and erase() behaves, as it should, as O(n);
there's no amortization in erase(): basically, erase(it0,it1) erases
the range elements one by one. The /*1*/ part should perform
as O(n log n), the log n part due to the usage of replace(), still
way below quadratic.

I'd check the following for anomalies:

1. Could you (just for tests), replace index #0, which now is hashed,
with an ordered_unique index and rerun? A quadratic behavior could
be the result of a very bad hashing scheme: this is extremely unlikely
for your current code, but who knows.

2. Is the size of CacheEntry::d_records roughly the same between
prunes? Could it be that the average number of records in CacheEntry
elements is disproportionately larger on those prunes that take
longer?

3. What percentage of elements of d_cache are visited on each
prune? I.e. what is the typical value of looked/d_cache.size()? The
nonvisited elements shoudn't have any impact on phase /*2*/,
but they have a (minimal) impact on those operations of /*1*/
performing as O(n log n), such as replace(). Anyway, no amount
on nonvisited elements can have such a huge impact on figures.

Looking fwd to your feedback,

Joaquín M López Muñoz
Telefónica, Investigación y Desarrollo


Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net