Boost :

From: Matthias Troyer (troyer_at_[hidden])
Date: 2005-11-13 03:52:28

Next message: Matthias Troyer: "Re: [boost] [serialization] fast array serialization (10x speedup)"
Previous message: Matthias Troyer: "Re: [boost] [serialization] fast array serialization (10x speedup)"
In reply to: Robert Ramey: "Re: [boost] [serialization] fast array serialization (10x speedup)"
Next in thread: Robert Ramey: "Re: [boost] [serialization] fast array serialization (10x speedup)"
Reply: Robert Ramey: "Re: [boost] [serialization] fast array serialization (10x speedup)"
Reply: Robert Ramey: "Re: [boost] [serialization] fast array serialization (10x speedup)"

On Nov 12, 2005, at 9:33 PM, Robert Ramey wrote:

> Now if someone feels differently and wants to implement such a thing, they
> have my full support. There is no need to modify the core library and no
> benefit - performance or otherwise. The following shows how to go about
> this.
>
> For purposes of this exposition, I am going to limit myself to how one would
> go about crafting a system similar to the one submitted. That is, I will
> not address concerns such as binary portability as they are not addressed
> in the submission as I see it. I'm assuming that the only issue is how
> best to arrange things so that save_binary/load_binary are invoked
> for contiguous collections of fundamental types.
>
> Suggestions
> ===========
> I do see the utility and merit in what you're trying to do here - finally.
> Honestly it just wasn't obvious from the initial package. So here is how
> I would have gone about it.
>

> [snip - look at original mail for the full proposal]

> So net result is:
>
> a) save_binary optimizations are invoked from the fast_oarchive_impl class.
> They only have to be specified once even though they are used in more
> than one variation of the binary archive. That is, if there are N
> types to be subjected to the treatment by M archives - there are
> only N overrides - regardless of the size of M.

Indeed, this reduces an NxM problem to a 2*N problem: the serialization
of every class that can profit from this mechanism needs to be written
twice. That is better than M times, but still worse than doing it once.
There is a more fundamental problem, though, which I will come to later.

>
> b) save_binary optimizations can be overridden for any particular archive
> type.
> (It's not clear to me how the current submission would address such a
> situation).

Actually, the problem is the reverse. In my proposal, the save_array
function of the archive can decide how to treat each type, while your
proposal dispatches everything to save_binary.
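
To make the contrast concrete, here is a minimal standalone sketch of an
archive whose save_array decides per element type what to do. It is not
code from my submission or from Boost.Serialization: sketch_oarchive, its
members and the choice of std::is_arithmetic as the criterion are all
assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical archive for illustration; not a Boost.Serialization class.
class sketch_oarchive {
public:
    // Arithmetic element types: this archive chooses one bulk write.
    template <class T>
    typename std::enable_if<std::is_arithmetic<T>::value>::type
    save_array(const T* p, std::size_t n) {
        save_binary(p, n * sizeof(T));
        ++block_writes;
    }

    // Any other element type: the same archive falls back to per-element saves.
    template <class T>
    typename std::enable_if<!std::is_arithmetic<T>::value>::type
    save_array(const T* p, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i) save(p[i]);
    }

    // Per-element save for strings: length prefix, then the characters.
    void save(const std::string& s) {
        const std::size_t n = s.size();
        save_binary(&n, sizeof n);
        save_binary(s.data(), n);
    }

    std::vector<char> buffer;  // bytes written so far
    int block_writes = 0;      // number of bulk writes performed

private:
    // Appends raw bytes to the buffer.
    void save_binary(const void* p, std::size_t bytes) {
        const char* c = static_cast<const char*>(p);
        buffer.insert(buffer.end(), c, c + bytes);
    }
};

// Usage: doubles take the bulk path, strings take the per-element path.
inline void save_array_demo() {
    sketch_oarchive ar;
    const double d[3] = {1.0, 2.0, 3.0};
    const std::string s[2] = {"ab", "c"};
    ar.save_array(d, 3);
    ar.save_array(s, 2);
    assert(ar.block_writes == 1);  // only the double array was written in bulk
}
```

A different archive could supply its own save_array overloads with a
different criterion, which is the flexibility I am referring to.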

> c) There is no need to alter the current library.
>
> d) It doesn't require that anything in the current library be conditioned on
> what kind of archive is being used. Insertion of such a coupling would be
> extremely unfortunate and create a lot of future maintenance work.
> This would be extremely unfortunate for such a special purpose library. This
> is especially true since it's absolutely unnecessary.

This coupling can be removed in my proposal just by moving the
serialization of arrays out of iserializer.hpp and oserializer.hpp and
into a separate header. A coupling between archive types and the
serialization of arrays will be necessary at some point, and
encapsulating it in a single small header file is probably the best
option.
>
> e) The implementation above could easily be improved to be resolved totally
> at compile time. Built with a high quality compiler (with the appropriate
> optimization switches set), this would result in the fastest possible code.

Same as my proposal.

> f) all code for save_binary is in one place - within fast_oarchive_impl. If
> fast_oarchive_impl is implemented as a template, it could be applied to
> any existing archive class - even text and xml. I don't know if there
> would be any interest in doing that - but it's not inconceivable. Note
> also that including all the save_binary optimizations for all of
> std::vector
> ublas::vector
> ublas::matrix
> mtl::vector
> blitz::array
> custom_lib::fast_matrix
> doesn't require that the headers for these libraries be included. The
> code in the header isn't required until the template is instantiated.
> So there wouldn't be any "header bloat"(tm)

This is where the real problem is hidden. I will explain it below,
where you discuss the alternatives.
>
> g) Now, f above could also be seen as a disadvantage. That is, it
> might seem better to let each one involved in serialization of
> a particular collection keep his stuff separate. There are a
> couple of options here I will sketch out.
>
> i) one could make a new trait is_bitwise_serializable whose default
> value is false. For each collection type one would specialize this
> like:
>
> template<class T>
> struct is_bitwise_serializable<vector <T> > {
> ... is_fundamental<T>
> ... get_size(){ // override default options which is sizeof(T)
> ..
> }
>
>
> Now fast_oarchive_impl would contain something like:
>
> // here's a way to do it for all vectors in one shot
> template<class T>
> void save_override(const T & t, int){
> // if T is NOT bitwise serializable - insert mpl magic required here
> // forward call to base class
> this->Base::save_override(t, 0);
> // else -
> *(this->This()) << make_nvp("count", t.size() * sizeof(T));
> *(this->This()) << make_nvp(make_binary_object(...get_size(), & t));
> // note - the nvp wrappers are probably not necessary if
> // we're only going to apply this to binary archives.
> }
>
> Which would implement the save_binary optimization for all types with
> the is_bitwise_serializable trait set. Of course any class
> derived from fast_oarchive_impl could override this as before.

There is one serious and fundamental flaw here: whether or not a
certain type can be serialized more efficiently as an array depends
not only on the type, but also on the archive. Hence we need a trait
taking BOTH the archive and the type, like the
has_fast_array_serialization trait that I proposed.
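
As an illustration of what a trait on both parameters buys, here is a
small self-contained sketch. The trait is named after the one mentioned
above, but its definition here is a stand-in written for this example, not
the code from the submission; the three archive tag types and the
uses_fast_path helper are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical archive types, used only as tags in this sketch.
struct native_binary_oarchive_sketch {};
struct portable_binary_oarchive_sketch {};
struct text_oarchive_sketch {};

// Primary template: by default no (archive, type) pair takes the fast path.
template <class Archive, class T>
struct has_fast_array_serialization : std::false_type {};

// Assumption: a native binary archive can block-copy any arithmetic type.
template <class T>
struct has_fast_array_serialization<native_binary_oarchive_sketch, T>
    : std::is_arithmetic<T> {};

// Assumption: a byte-order-independent archive can only block-copy
// single-byte integers, since wider types need per-element conversion.
template <class T>
struct has_fast_array_serialization<portable_binary_oarchive_sketch, T>
    : std::integral_constant<bool,
          std::is_integral<T>::value && sizeof(T) == 1> {};

// Same element type, different answer depending on the archive.
static_assert(has_fast_array_serialization<
                  native_binary_oarchive_sketch, double>::value,
              "native binary archive block-copies doubles");
static_assert(!has_fast_array_serialization<
                  portable_binary_oarchive_sketch, double>::value,
              "portable archive must convert doubles one by one");
static_assert(!has_fast_array_serialization<
                  text_oarchive_sketch, double>::value,
              "text archive has no fast path in this sketch");

// Runtime view of the same compile-time answer.
template <class Archive, class T>
bool uses_fast_path() {
    return has_fast_array_serialization<Archive, T>::value;
}

inline void trait_demo() {
    assert((uses_fast_path<native_binary_oarchive_sketch, int>()));
    assert((!uses_fast_path<text_oarchive_sketch, int>()));
}
```

A trait on the type alone, such as is_bitwise_serializable, cannot express
the second and third assertions.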

> ii) another option would be to implement differing serializations
> depending upon the archive type. So that we might have
>
> template<class T>
> void save(fast_oarchive_impl &ar, const std::vector<T> & t, const unsigned int){
> // if T is a fundamental type or ....
> ar << t.size();
> ar.save_binary(t.size() * sizeof(T), t.data?());
> }
>
> This would basically be a much simpler substitute for the
> "fast_archive_trait" proposed by the submission.

Now we are back to an NxM problem.

But the real issue is that for many array, vector, or matrix types
this approach is not feasible, since serialization of those types needs
to be intrusive. I cannot simply reimplement it inside the archive;
the authors of these classes need to implement the serialization
themselves. Hence, your approach will not work for MTL matrices, Blitz
arrays, and other such data types.
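
To show what I mean by intrusive, here is a rough standalone sketch of a
class with private storage that implements its own serialization once and
lets a two-parameter trait pick the path for each archive. All names
(dense_matrix_sketch, fast_array_sketch, the two archive types) are
hypothetical; this is not how MTL or Blitz are actually written.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical two-parameter trait; defaults to the element-wise path.
template <class Archive, class T>
struct fast_array_sketch : std::false_type {};

// Hypothetical archive that offers a bulk save_array for doubles.
struct block_oarchive_sketch {
    std::vector<double> out;
    int block_writes = 0;
    void save(double x) { out.push_back(x); }
    void save_array(const double* p, std::size_t n) {
        out.insert(out.end(), p, p + n);
        ++block_writes;
    }
};
template <>
struct fast_array_sketch<block_oarchive_sketch, double> : std::true_type {};

// Hypothetical archive with no bulk interface at all.
struct elementwise_oarchive_sketch {
    std::vector<double> out;
    void save(double x) { out.push_back(x); }
};

// Stand-in for a library matrix whose storage is private, so code
// outside the class (including the archive) cannot reach the data.
class dense_matrix_sketch {
public:
    dense_matrix_sketch(std::size_t rows, std::size_t cols)
        : data_(rows * cols, 0.0) {}

    // Written once by the class author; the trait selects the path.
    template <class Archive>
    void save(Archive& ar) const {
        save_elements(ar,
                      typename fast_array_sketch<Archive, double>::type());
    }

private:
    // Fast path: hand the whole contiguous block to the archive.
    template <class Archive>
    void save_elements(Archive& ar, std::true_type) const {
        ar.save_array(data_.data(), data_.size());
    }

    // Fallback: one save call per element.
    template <class Archive>
    void save_elements(Archive& ar, std::false_type) const {
        for (std::size_t i = 0; i < data_.size(); ++i) ar.save(data_[i]);
    }

    std::vector<double> data_;
};

inline void intrusive_demo() {
    dense_matrix_sketch m(2, 3);
    block_oarchive_sketch fast;
    m.save(fast);
    assert(fast.block_writes == 1);  // six elements, one bulk write
}
```

The class author writes save once, and neither archive needs to know
anything about the matrix type.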

Matthias


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk