
From: Dave Abrahams (abrahams_at_[hidden])
Date: 2000-06-01 06:20:26


on 6/1/00 3:53 AM, Mark Rodgers at mark.rodgers_at_[hidden] wrote:

> One further thought - it would be nice if you could improve the
> efficiency of bulk inserts. I tried adding the following
> function to associative_vector:
>
> template<class Iterator>
> void insert_equal_x(Iterator first, Iterator last) {
>     m_impl.insert(m_impl.end(), first, last);
>     std::sort(m_impl.begin(), m_impl.end(),
>               compose_f_gx_hy(m_key_compare, KeyOfValue(), KeyOfValue()));
> }
>
> I then used the following test function
>
> void test(int n, int m)
> {
>     std::vector<int> initial(n);
>     std::generate_n(initial.begin(), n, std::rand);

Aha, generate_n! I *knew* there was an algorithm for something like that,
but didn't find it when I was writing my tests.
 
>     std::vector<int> add(m);
>     std::generate_n(add.begin(), m, std::rand);
>
>     avint av1;
>     av1.insert_equal_x(initial.begin(), initial.end());
>     avint av2;
>     av2.insert_equal_x(initial.begin(), initial.end());
>
>     std::cout << n << "," << m << std::endl;
>     {
>         boost::timer t;
>         av1.insert_equal(add.begin(), add.end());
>         std::cout << "\t" << t.elapsed() << std::endl;
>     }
>     {
>         boost::timer t;
>         av2.insert_equal_x(add.begin(), add.end());
>         std::cout << "\t" << t.elapsed() << std::endl;
>     }
> }

This test is a cool idea; thanks for trying it. But it doesn't seem quite
fair unless you implement the improvement suggested earlier for the existing
insert_equal and pre-reserve space in the vector equal to the number of new
elements. I'm sure other effects are being swamped by memory allocation.
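
For reference, here is roughly what I mean. This is only a sketch against a
plain sorted std::vector<int> rather than the real associative_vector
internals, and insert_equal_reserving is a made-up name:

    #include <algorithm>
    #include <iterator>
    #include <vector>

    // Incremental insert with one up-front allocation instead of
    // repeated reallocations as the vector grows.
    template <class Iterator>
    void insert_equal_reserving(std::vector<int>& v,
                                Iterator first, Iterator last)
    {
        v.reserve(v.size() + std::distance(first, last));
        for (; first != last; ++first)  // one binary search + insert each
            v.insert(std::upper_bound(v.begin(), v.end(), *first), *first);
    }

Note that reserve() only removes the allocation cost; each insert still has
to shift the tail of the vector.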
 
> and measured the results for various values of n and m. The results
> were fascinating (times in seconds; insert_equal first, then
> insert_equal_x, per the test above):
>
>     n,m            insert_equal    insert_equal_x
>     1,10000        1.1             0.025
>     10,10000       0.65            0.025
>     100,10000      0.643           0.026
>     1000,10000     0.754           0.028
>     10000,10000    1.773           0.052
>     10000,1000     0.131           0.035
>     10000,100      0.016           0.058
>     10000,10       0.004           0.466
>     10000,1        0.004           2.517
>
> It suggests that there is probably some rule that could be used to
> switch between the two algorithms. Either
>
> 1. size() < N, or
> 2. distance(first, last) > M, or
> 3. distance(first, last) / size() > P.

Let's see:
inserting one element in a container of n elements should cost log(n)
comparisons. So inserting m elements costs:
log(n) + log(n+1) + log(n+2) + ... + log(n+m-1)

and I haven't done this for years, but that looks like:
  O(m*log(n+m))
Somebody correct me.
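
(Checking the estimate: each of the m terms is at most log(n+m), so the sum
is at most m*log(n+m); and the last m/2 terms are each at least
log(n + m/2), so the sum is also at least (m/2)*log(n + m/2). Both bounds
are Theta(m*log(n+m)), so the guess holds.)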

It wouldn't surprise me if, even after adding the appropriate reserve() to
insert_equal(), there was such a balancing point, since your version only
has to move each element once and std::sort() averages O(n log(n))
comparisons. Where the balancing point lies, exactly, probably changes with
the cost of copying elements; I'd expect the win from your approach to be
much greater for non-trivial elements. But we ought to be careful:
std::sort() is only required to have characteristics as good as quicksort,
so it could be O(n^2) in the worst case. If we know we have an
implementation which uses introsort (e.g. the SGI STL), it is always safe
to make the optimization you suggest.
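
To make rule 3 from your list concrete, the switch might look something
like this (again a standalone sketch over std::vector<int>; the constant 50
is pure guesswork, since your numbers put the crossover somewhere between
m/size() of 1/100 and 1/10):

    #include <algorithm>
    #include <cstddef>
    #include <iterator>
    #include <vector>

    template <class Iterator>
    void insert_equal_adaptive(std::vector<int>& v,
                               Iterator first, Iterator last)
    {
        const std::size_t m = std::distance(first, last);
        if (50 * m > v.size()) {
            // Bulk path (your insert_equal_x): append, then sort everything.
            v.insert(v.end(), first, last);
            std::sort(v.begin(), v.end());
        } else {
            // Incremental path, with the reserve() tweak from above:
            // one binary search + insert per element.
            v.reserve(v.size() + m);
            for (; first != last; ++first)
                v.insert(std::upper_bound(v.begin(), v.end(), *first),
                         *first);
        }
    }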

Finally, another way to get optimization for large m would be to
(adaptively) pre-sort the input then repeatedly insert with a hint, which
would almost always be correct. But I think this is probably inferior to
just sorting the whole enchilada as you have suggested, because of element
movement issues.
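
Roughly, and written against std::multiset just to show the shape
(insert_presorted is a made-up name, and this skips the "adaptively" part):

    #include <algorithm>
    #include <set>
    #include <vector>

    // Pre-sort the input, then do repeated hinted inserts. With sorted
    // input, each element belongs at or just after the previous insertion
    // point, so the hint is almost always right.
    template <class T, class Iterator>
    void insert_presorted(std::multiset<T>& s, Iterator first, Iterator last)
    {
        std::vector<T> buf(first, last);
        std::sort(buf.begin(), buf.end());
        typename std::multiset<T>::iterator hint = s.begin();
        for (typename std::vector<T>::const_iterator i = buf.begin();
             i != buf.end(); ++i)
            hint = s.insert(hint, *i);  // hinted insert returns the
                                        // position of the new element
    }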

> I would think that at least vector_multiset's ranged constructor
> could take advantage of this alternative.
>
> For insert_unique perhaps something along the lines of
>
> template<class Iterator>
> void insert_unique_x(Iterator first, Iterator last) {
>     m_impl.insert(m_impl.end(), first, last);
>     std::sort(m_impl.begin(), m_impl.end(),
>               compose_f_gx_hy(m_key_compare, KeyOfValue(), KeyOfValue()));
>
>     m_impl.erase(std::unique(m_impl.begin(), m_impl.end(),
>                              compose_f_gx_hy(m_key_compare,
>                                              KeyOfValue(), KeyOfValue())),
>                  m_impl.end());
> }
>
> would be advantageous in certain circumstances.

That's a cool idea, too. But I think you'd need to use stable_sort() here.
Just because elements are equivalent under the comparison criterion doesn't
mean that they're identical ;)

SGI says:

Stable_sort is an adaptive algorithm: it attempts to allocate a temporary
memory buffer, and its run-time complexity depends on how much memory is
available. Worst-case behavior (if no auxiliary memory is available) is N
(log N)^2 comparisons, where N is last - first, and best case (if a large
enough auxiliary memory buffer is available) is N (log N). [2]
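
Concretely, the fix is one call swapped; a standalone sketch over
std::vector<int>, where == and equivalence coincide (the real version would
keep the compose_f_gx_hy predicates). Because the newcomers are appended
after the existing elements, stable_sort leaves each pre-existing element
first within its run of equivalents, and unique() then keeps it and
discards the newcomers, which is exactly insert_unique's don't-replace
behavior:

    #include <algorithm>
    #include <vector>

    template <class Iterator>
    void insert_unique_bulk(std::vector<int>& v,
                            Iterator first, Iterator last)
    {
        v.insert(v.end(), first, last);
        std::stable_sort(v.begin(), v.end());  // originals stay ahead of
                                               // equivalent newcomers
        v.erase(std::unique(v.begin(), v.end()), v.end());
    }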

> Just a thought.

Several good thoughts! My primary interest for associative_vector & co. was
not to make insertions very efficient, but to improve on the data/code size
requirements of the standard associative containers while keeping the
efficient lookups. The constant factor for those containers on insertion is
probably so large, however, that trying to optimize associative_vector for
insertion speed is an excellent idea. Would you be willing to continue
pursuing the testing end of things? If we can come up with some conclusive
answers about where the tradeoffs lie, I will be glad to implement the
optimizations.

-Dave

