Subject: Re: [Boost-users] [Boost][Serialization] CPU bottleneck
From: Robert Ramey (ramey_at_[hidden])
Date: 2008-09-14 18:56:03


I've reviewed the profile and found it interesting.

Have you tried binary_?archive (binary_iarchive / binary_oarchive)? You
would find it much, much faster in this case, for a variety of reasons.

To maintain portability of text files, the library has to manipulate each
character sent. This takes a lot of time and it adds up. You might
experiment with creating a temporary array, wrapping it in a binary_object,
and sending it that way. But still, the very fastest will be to use
binary_?archive.
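
To make the per-character cost concrete, here is a small standard-library
sketch. It is not the library's actual code; it only imitates the general
idea of a text encoding that formats every element versus a binary
encoding that copies the buffer in one block. All function names are
hypothetical, and the binary layout (a native size_t length prefix) is an
assumption chosen for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <sstream>
#include <string>
#include <vector>

// Text-style encoding: each byte is formatted as a decimal number with a
// separator, so every element passes through the stream formatting code.
std::string encode_as_text(const std::vector<char>& data) {
    std::ostringstream os;
    os << data.size();
    for (char c : data) {
        os << ' ' << static_cast<int>(c);
    }
    return os.str();
}

// Text-style decoding: one numeric parse per element.
std::vector<char> decode_from_text(const std::string& text) {
    std::istringstream is(text);
    std::size_t count = 0;
    is >> count;
    std::vector<char> data;
    data.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        int value = 0;
        is >> value;
        data.push_back(static_cast<char>(value));
    }
    assert(!is.fail());
    return data;
}

// Binary-style encoding: a length prefix followed by one bulk copy.
std::string encode_as_binary(const std::vector<char>& data) {
    const std::size_t count = data.size();
    std::string out(reinterpret_cast<const char*>(&count), sizeof count);
    out.append(data.begin(), data.end());
    return out;
}

// Binary-style decoding: read the prefix, then copy the rest in one go.
std::vector<char> decode_from_binary(const std::string& bytes) {
    assert(bytes.size() >= sizeof(std::size_t));
    std::size_t count = 0;
    std::memcpy(&count, bytes.data(), sizeof count);
    assert(bytes.size() - sizeof count >= count);
    const char* first = bytes.data() + sizeof count;
    return std::vector<char>(first, first + count);
}
```

Both pairs round-trip the same data, but the text form does one format
and one parse per byte and produces a larger payload, while the binary
form does a constant amount of bookkeeping plus a single copy. For a 1MB
vector that difference is about a million formatting operations on each
side.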

Robert Ramey

Vjekoslav Brajkovic wrote:
> Hi,
>
> I am using the serialization library in my project and it has been
> functioning perfectly thus far. However, when I scaled up the usage
> requirements, I hit a very odd problem.
>
> Let me explain the use case. The serialization library is used in a DFS
> framework, handling large data structures (in terms of size, not
> complexity) such as std::vector<char> of size 1MB and above. The
> framework consists of two major components: Chunkserver (server) and a
> Client. Files are chunked, wrapped in a class, serialized and sent
> over the wire. The same thing is done on the server side, but in
> reverse order. The actual binary data is stored in a vector
> (previously, I tried using a string instead, but I had some issues
> with it and Robert suggested using an alternative container).
>
> When I was depositing large files to Chunkserver, disk utilization was
> almost non-existent, whereas the CPU was maxed out. It is important to
> realize that this problem occurred only on the server side, not on the
> client. Upon further investigation using gprof, I concluded that the
> bottleneck was in the serialization library (it may also be the case
> that I am misusing it). According to the profiler, over 97% of the
> CPU time was spent in a single function. Profiler results can be found
> at this address:
>
> http://www.cs.washington.edu/homes/balkan/gprof.txt
>
> For reference, the signature of that function is:
> boost::serialization::serialize_adl<
> boost::archive::text_iarchive, std::vector...>
> and I am using a text archive. As I mentioned before, this issue only
> occurs on the server side.
>
> I would appreciate it if anybody could explain why this is happening
> and, more importantly, how to circumvent the issue.
>
> Thank you!
>
> Vjeko

