I compiled an application that uses ublas with mapped_matrix<double>.  I am running an optimized build (with -DNDEBUG), and I am seeing very poor performance from mapped_matrix.

Running gprof shows that more than 50% of the execution time is spent in
    std::_Rb_tree<unsigned long, std::pair<unsigned long const, double>, std::_Select1st<std::pair<unsigned long const, double> >, std::less<unsigned long>, std::allocator<std::pair<unsigned long const, double> > >::lower_bound(unsigned long const&) const

More than 30% of the execution time is spent in:

And 6% of the execution time is spent in mapped_matrix::find2

This application runs a lot of matvec (matrix-vector product) operations, so I expect those to dominate the run time.  I have just migrated from another linear algebra package that runs the same problem orders of magnitude faster.
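
The lower_bound symbol in the profile is the red-black tree lookup behind std::map<unsigned long, double>, which suggests that each matvec is walking a tree rather than contiguous arrays.  Below is a minimal standalone sketch of that difference.  It does not use ublas and is not the application's code: MapSparse, CsrSparse, to_csr and both matvec overloads are hypothetical names made up for illustration, and the row-major key (row * cols + col) is an assumption about the layout.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical map-backed sparse matrix: a single ordered map keyed by
// row * cols + col (assumed row-major layout).
struct MapSparse {
    std::size_t rows, cols;
    std::map<std::size_t, double> data;
};

// Hypothetical compressed sparse row (CSR) matrix: three flat arrays.
struct CsrSparse {
    std::size_t rows, cols;
    std::vector<std::size_t> row_ptr;  // rows + 1 offsets into col_idx/values
    std::vector<std::size_t> col_idx;  // column of each stored element
    std::vector<double> values;        // value of each stored element
};

// matvec on the map: two tree searches (lower_bound) per row, then a
// pointer-chasing walk over the tree nodes of that row.
std::vector<double> matvec(const MapSparse& a, const std::vector<double>& x) {
    std::vector<double> y(a.rows, 0.0);
    for (std::size_t i = 0; i < a.rows; ++i) {
        auto it = a.data.lower_bound(i * a.cols);
        auto end = a.data.lower_bound((i + 1) * a.cols);
        for (; it != end; ++it)
            y[i] += it->second * x[it->first - i * a.cols];
    }
    return y;
}

// One-time conversion; the map iterates in key order, which is already
// row-major, so elements can be appended as they come.
CsrSparse to_csr(const MapSparse& a) {
    CsrSparse c{a.rows, a.cols,
                std::vector<std::size_t>(a.rows + 1, 0), {}, {}};
    for (const auto& kv : a.data) {
        ++c.row_ptr[kv.first / a.cols + 1];      // count elements per row
        c.col_idx.push_back(kv.first % a.cols);
        c.values.push_back(kv.second);
    }
    for (std::size_t i = 0; i < a.rows; ++i)     // prefix sum -> row offsets
        c.row_ptr[i + 1] += c.row_ptr[i];
    return c;
}

// matvec on CSR: no searches, only linear scans over contiguous arrays.
std::vector<double> matvec(const CsrSparse& a, const std::vector<double>& x) {
    std::vector<double> y(a.rows, 0.0);
    for (std::size_t i = 0; i < a.rows; ++i)
        for (std::size_t k = a.row_ptr[i]; k < a.row_ptr[i + 1]; ++k)
            y[i] += a.values[k] * x[a.col_idx[k]];
    return y;
}
```

Both overloads return the same vector for the same matrix; the CSR version simply avoids the tree searches.  ublas has a compressed_matrix type with a layout of this general kind, so filling a mapped_matrix once and converting it before the matvec loop may be worth trying, though I have not measured that here.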

I suppose the obvious question is: is there a more efficient way to do sparse matrix operations?

Thanks for any advice,