
Boost Users :

From: Ovanes Markarian (om_boost_at_[hidden])
Date: 2007-04-30 14:05:37


On Mon, April 30, 2007 19:47, Manuel Jung wrote:
> Hi,
>
> I have to count a lot of words. Up to now I did it with MySQL, because it
> was easy. The result is saved there anyway. Now I thought I could speed
> this up a little by using a Multi_index list internally to store the
> words, so I only have to insert all the different words. The words are
> stored in a UnicodeString from the ICU library.
> My code is very close to the one from the example "sequenced.cpp".
> I'm using the following definition:
>
> typedef multi_index_container<
>   UnicodeString,
>   indexed_by<
>     sequenced<>,
>     ordered_non_unique<identity<UnicodeString> >
>   >
> > text_container;
> typedef nth_index<text_container,1>::type ordered_text;
> text_container tc;
>
>
> I'm inserting new words with "tc.push_back(UnicodeString(NewWord));"
> and counting them exactly as in the example. I thought this would be fast,
> but it isn't. It eats up all my CPU and is still a lot slower than
> my old solution.
> I still hope I can speed this up before I have to switch back to MySQL.
> The profile of a run says
> that "boost::multi_index::safe_mode::check_same_owner<..." eats most CPU
> time.
>
> Any suggestions on how to speed it up with Multi_index? Or any ideas about
> which other approach would be faster than MySQL inserts?
>
> Thanks
> Manu
>

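Try using a hashed_non_unique index instead of ordered_non_unique. A hashed index locates keys
by their hash value rather than through repeated calls to a comparison function. Note that a
hashed index needs a hash function for the key type, here UnicodeString.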
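
To illustrate the idea of hash-based lookup for word counting, here is a minimal sketch. It is
not Boost.MultiIndex code: it swaps in std::unordered_map from the standard library, and it uses
std::string as a stand-in for ICU's UnicodeString. The function name count_words is hypothetical
and only serves the example.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: counts how often each word occurs.
// Assumption: std::string stands in for ICU's UnicodeString, and
// std::unordered_map stands in for a hashed index.
std::unordered_map<std::string, std::size_t>
count_words(const std::vector<std::string>& words)
{
    std::unordered_map<std::string, std::size_t> counts;
    for (const std::string& word : words) {
        // operator[] value-initializes the counter to 0 the first time
        // a word is seen, so a single increment handles both cases.
        ++counts[word];
    }
    return counts;
}
```

Each distinct word is stored once and found through its hash, so an insert or lookup takes
constant time on average instead of a chain of string comparisons.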

My personal opinion is that if your words are in the database anyway, you should not retrieve them
from there and then store them again. An SQL solution will always be faster, since databases know
how to optimize statements and result sets as well.

With Kind Regards,

Ovanes Markarian

