From: Robert Ramey (ramey_at_[hidden])
Date: 2007-02-10 02:47:06
Our experiments with 1.3 revealed that the single biggest time consumer in
the binary archive was the stream i/o. Version 1.34 is re-implemented in
terms of std::streambuf rather than stream i/o, and it is significantly
faster because of this. If someone has a situation where performance is a
concern, I would recommend increasing the buffer size to a larger amount,
on the order of 1 MB. This can be done using standard library facilities.
That is, serialization depends upon standard streambuf i/o, and if
performance is a big consideration, one should configure the standard
streambuf accordingly.
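As a rough illustration of configuring the streambuf, here is a minimal
sketch that uses only the standard library. The helper names
open_with_big_buffer and write_ints are hypothetical, not part of Boost or
of anyone's test code. Note that the effect of pubsetbuf on a file buffer
is implementation-defined, and that it generally has to be called before
the file is opened to have any effect.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <ios>
#include <string>
#include <vector>

// Hypothetical helper: gives the stream's filebuf a caller-owned buffer and
// then opens the file in binary mode. The buffer must outlive the stream.
// Whether the buffer is actually used is implementation-defined.
inline void open_with_big_buffer(std::ofstream& ofs,
                                 std::vector<char>& buffer,
                                 const std::string& path) {
    assert(!buffer.empty());
    // Set the buffer BEFORE open(); many implementations ignore it afterwards.
    ofs.rdbuf()->pubsetbuf(buffer.data(),
                           static_cast<std::streamsize>(buffer.size()));
    ofs.open(path.c_str(), std::ios::binary | std::ios::trunc);
}

// Hypothetical helper: writes the ints as raw bytes straight through the
// streambuf and returns the number of bytes accepted.
inline std::streamsize write_ints(std::ofstream& ofs,
                                  const std::vector<int>& values) {
    if (values.empty()) return 0;
    return ofs.rdbuf()->sputn(
        reinterpret_cast<const char*>(values.data()),
        static_cast<std::streamsize>(values.size() * sizeof(int)));
}
```

A stream opened this way, with a buffer of 1 << 20 bytes declared before the
stream, could then be handed to a binary_oarchive in the same way as the
ofstream in the quoted test below. That step is left out here so the sketch
does not depend on Boost.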
In cases like your test case, there is still the opportunity to specialize
the binary archives to handle these cases faster. Indeed, it's quite simple
to use the facilities of the library to implement special handlers for the
types which can benefit from them. So a realistic test/comparison would
have to consider that this is what a real user would want to do when
confronted with this kind of situation.
However, this was deemed "not good enough" in some quarters. So
specializations were added to the default 1.35 version of the binary
archives to speed up exactly these special cases: large collections of
primitives.
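To make the idea behind such a special case concrete, here is a small
sketch using only the standard library. It is not the library's
implementation, and the names save_each, save_block and load_block are
hypothetical. It assumes a contiguous container (a vector) and a reader with
the same int size and byte order as the writer. Both save functions produce
the same bytes; the second hands the whole array to the streambuf in one
call instead of one call per element.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <streambuf>
#include <string>
#include <vector>

// Element-by-element: writes the count, then issues one sputn per int.
inline void save_each(std::streambuf& sb, const std::vector<int>& v) {
    const std::size_t n = v.size();
    sb.sputn(reinterpret_cast<const char*>(&n), sizeof n);
    for (std::size_t i = 0; i < n; ++i)
        sb.sputn(reinterpret_cast<const char*>(&v[i]), sizeof(int));
}

// Block version: writes the count, then the contiguous array in one sputn.
inline void save_block(std::streambuf& sb, const std::vector<int>& v) {
    const std::size_t n = v.size();
    sb.sputn(reinterpret_cast<const char*>(&n), sizeof n);
    if (n != 0)
        sb.sputn(reinterpret_cast<const char*>(v.data()),
                 static_cast<std::streamsize>(n * sizeof(int)));
}

// Reads back what either save function wrote (same platform assumed).
inline std::vector<int> load_block(std::streambuf& sb) {
    std::size_t n = 0;
    const std::streamsize got = sb.sgetn(reinterpret_cast<char*>(&n), sizeof n);
    assert(got == static_cast<std::streamsize>(sizeof n));
    (void)got;  // silence unused-variable warnings when NDEBUG is defined
    std::vector<int> v(n);
    if (n != 0)
        sb.sgetn(reinterpret_cast<char*>(v.data()),
                 static_cast<std::streamsize>(n * sizeof(int)));
    return v;
}
```

The block form only applies to containers with contiguous storage, such as
arrays and vectors. A list or a set still has to be walked element by
element.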
The upshot is that, for large collections of primitives like chars and
ints, our tests show that the current version in the head (1.35) will be
approximately 10 times faster than the 1.33 you have used as a basis for
comparison. (Hmmm, actually we used arrays and vectors, so we don't have a
complete comparison.) Anyway, I would guess that the current version of the
library and any other recently created serialization library would be
comparable.
So I don't think this is a big issue. Actually, I never thought it was a
big issue, since large collections of primitives are not the most common
application of the library, and the library could (and can) accommodate
special cases with the facilities built into the library itself. What I
think/thought about this didn't really matter though, as the library was
easily extended by the interested parties to accommodate those who wanted
to invest the effort.
Robert Ramey
brass goowy wrote:
> Shalom
>
> I've been comparing results from Boost Serialization (B.Ser) and
> Ebenezer Enterprises(EE) on Windows XP lately. I've compared
> saving a
> 1. set<int>,
> 2. list<int>, and
> 3. list<int> and deque<int>.
>
> I'm using MSVC8.0, Boost 1.33.1, and software from www.webEbenezer.net
> to build the tests. I use clock() statements to measure the amount
> of time used. I've read on this list that there is an issue with
> using clock() on Windows, but I use it the same way in all the tests
> so doubt it is an issue here.
> I use a buffer of 4096 bytes in the EE versions and from what I can
> tell the Boost versions also use the same size of buffer. (I'm not
> doing anything to set the buffer size with Boost. It seems to
> default to 4096.) Each of the containers is filled with 1,000,000
> ints. Below are a few lines from one of the Boost tests.
>
> ofstream ofs("myfile");
> binary_oarchive oa(ofs);
>
> clock_t start(clock());
> oa << lst;
> clock_t end(clock());
> cout << "That took " << end - start << "\n";
>
>
> Build times/Exe sizes
> In each of the tests the B.Ser versions take longer to build and the
> executables are more than two times bigger in bytes than the EE
> versions.
>
> Run times
> I ran the B.Ser and EE versions 3 times in a row and threw out the
> fastest and slowest times and kept the remaining middle time.
> The following results are from optimized (O2) versions of the tests.
>
> set<int>
> B.Ser ----- 1630
> EE --------- 451
> In this test the B.Ser version takes 3.6 times longer than the
> EE version.
>
> list<int>
> B.Ser ----- 1440
> EE --------- 271
> B.Ser takes 5.3 times longer here.
>
> list<int> and deque<int>
> B.Ser ----- 2894
> EE --------- 521
> B.Ser takes 5.5 times longer here.
>
> I've only done a few tests without optimization. The results from
> those tests have had higher ratios than those listed above. For
> example, the non-optimized B.Ser version of the list<int> test
> is about 8 times slower than the non-optimized EE version.
> One thing that sticks out in my mind is that the optimized B.Ser
> version of the list<int> test is 3 times slower than the non-optimized
> EE version.
>
> These results are similar to what we observed on Linux previously.
> http://lists.boost.org/Archives/boost/2005/11/96497.php
>
> I didn't test exactly the same thing in the Windows tests and the
> Linux tests. Feedback from the Linux tests indicated some objection
> to commenting out a generated call to flush the buffer we use.
> I didn't comment out any of the generated code in these Windows
> tests like I did with Linux. And so the Windows tests fill the
> buffer and flush it numerous times.
>
> Regards,
> Brian Wood
> Ebenezer Enterprises
> www.webEbenezer.net