From: Karl Nelson (kenelson_at_[hidden])
Date: 2000-12-04 18:17:30


> In message <200012042113.NAA01465_at_[hidden]>, Karl Nelson
> <kenelson_at_[hidden]> writes
> [good summary]
> > - sharing (fast copy, smaller run size)
> > vs
> > - cloning (slow copy, large run size, more types)
>
> Could you clarify the size issue? Do you mean sizeof footprint, memory
> allocated, ...?

Sure,

There are three sizes to a program, and the design of the system
affects all three:

  - static binary size (how much stuff gets compiled into the executable)
  - runtime binary size (how much actually gets loaded)
  - runtime data size (how big the objects created at run time are)

Of the things in my last summary, here is how some of them affect
these sizes:

  sharing (reference counting) - reduces runtime data size if many copies
    are used, but increases it when things can't be shared, because you
    need extra data (the counters). This system also potentially increases
    binary size, as the counter-handling code may get scattered
    everywhere. That competes against the copy code which sharing avoided.

  cloning - every copy increases runtime data size, so if the user copies
    functors frequently, they will get loads of runtime data. Further,
    each of those copies must implement lots of operator=() code, so you
    get runtime and static binaries which are larger. However, if copies
    are made infrequently, this system can be much smaller on all three
    counts. This is really the pay-as-you-go system, because you don't
    pay the overhead unless the user starts doing expensive things.

Clearly, depending on the frequency of copying, you can get vastly
different results (a sketch of both copy paths follows the two
examples below).

That is, sharing is vastly better for

  vector<Functor<void> > v;
  Functor<void> f=myfunctor(0);
  for (int i=0;i<1000;i++)
     v.push_back(f);

While cloning is likely better for...

  vector<Functor<void> > v;
  for (int i=0;i<1000;i++)
     v.push_back(myfunctor(i));
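
To make those two copy paths concrete, here is a minimal sketch of
what each push_back above costs under the two schemes. The names
(FunctorImpl, SharedFunctor, CloningFunctor) are illustrative only,
not taken from sigc++ or boost:

  struct FunctorImpl {
    int count;                             // used only when sharing
    FunctorImpl() : count(1) {}
    virtual ~FunctorImpl() {}
    virtual FunctorImpl* clone() const { return new FunctorImpl(*this); }
  };

  // Sharing: a copy just bumps the counter; no allocation, nothing
  // duplicated. The counter handling is the price paid everywhere.
  struct SharedFunctor {
    FunctorImpl* impl;
    SharedFunctor() : impl(new FunctorImpl) {}
    SharedFunctor(const SharedFunctor& o) : impl(o.impl) { ++impl->count; }
    ~SharedFunctor() { if (--impl->count == 0) delete impl; }
    // operator=() omitted for brevity
  };

  // Cloning: every copy allocates and duplicates the implementation.
  struct CloningFunctor {
    FunctorImpl* impl;
    CloningFunctor() : impl(new FunctorImpl) {}
    CloningFunctor(const CloningFunctor& o) : impl(o.impl->clone()) {}
    ~CloningFunctor() { delete impl; }
    // operator=() omitted for brevity
  };

In the first loop the shared scheme does 1000 increments where the
cloning scheme does 1000 allocations; in the second loop both do
comparable work, but the shared version still carries its counter.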
   

Other factors that affect these sizes...
  virtual templates - Use of a virtual in a template class greatly
    increases the number of type nodes, which increases the static
    binary size. Unless these nodes are used, they may not actually get
    loaded, so the runtime binary may not reflect these extra data
    nodes. Start dynamic casting in the functors, though, and this gob
    of stuff will get loaded.

  factored templates - The factored design pulls the templatized data out
    of the vtable at the cost of an extra struct and function pointers
    (a sketch follows this list). This increases runtime data size, and
    marginally the runtime binary size, as you must set up these
    hand-built vtables yourself. However, it removes all the vtables and
    other crud which C++ generates for classes, and thus the size of the
    static binary drops dramatically. For reference, sigc++ code has
    been 40% smaller in terms of compiled binary size compared to the
    boost systems I have downloaded. (This is the stripped size. If we
    start talking about unstripped sizes, sigc++ wins by orders of
    magnitude, 1M to 40k.)

  non-library - All of the stuff discussed thus far is contained in a
    header, so all of it must potentially get compiled into the binaries
    produced. If you are using sharing, this means reimplementing the
    same functions over and over, either inlined or as separate
    functions. (Is it really necessary for
    smart_ptr<Functor<void> >::reference() and
    smart_ptr<Functor<void,int> >::reference() to be separately compiled
    functions?) You can cut some of this down with really good partial
    specialization; the second sketch after this list shows one common
    factoring of the counting code.

  library - in this case, all the stuff that can be factored and doesn't
    need to be inlined gets stuck in the library. The library reduces
    the static binary sizes but increases the runtime binary size
    (unless the library gets shared).
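
As a rough sketch of the factored design named above (hand-rolled and
illustrative; the real sigc++ internals differ), the templatized code
shrinks to a couple of thunks, and each object carries pointers to
them instead of a compiler-generated vtable:

  // Hand-built "vtable": one static instance per functor type, with
  // no compiler vtable, no type_info node, and no RTTI support code.
  struct FunctorOps {
    void (*invoke)(void* data);
    void (*destroy)(void* data);
  };

  struct FunctorData {
    const FunctorOps* ops;     // the hand-built vtable
    void* data;                // the factored-out, type-erased state
  };

  // The only templatized pieces are these thunks:
  template <class F>
  void invoke_thunk(void* data)  { (*static_cast<F*>(data))(); }

  template <class F>
  void destroy_thunk(void* data) { delete static_cast<F*>(data); }

  // For a concrete functor type F, the table would be built once:
  //   static const FunctorOps ops = { &invoke_thunk<F>, &destroy_thunk<F> };

And for the sharing-in-a-header problem, one common factoring (again a
sketch, not the actual smart_ptr in question) hoists the counting into
a non-template base so every smart_ptr<T> shares one compiled
reference()/unreference():

  struct RefCounted {
    int count;
    RefCounted() : count(0) {}
    void reference();          // in a real library these two live in
    bool unreference();        // a .cpp, compiled exactly once
  };

  // Shown inline only to keep the sketch self-contained:
  void RefCounted::reference()   { ++count; }
  bool RefCounted::unreference() { return --count == 0; }

  template <class T>           // T must derive from RefCounted
  struct smart_ptr {
    T* p;
    explicit smart_ptr(T* q) : p(q) { p->reference(); }
    smart_ptr(const smart_ptr& o) : p(o.p) { p->reference(); }
    ~smart_ptr() { if (p->unreference()) delete p; }
    // operator=() omitted for brevity
  };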
  
Sigc++ is optimized for GUI use: safety over speed, and static binary
size over runtime data size. This meant heavy factorization, data
sharing, and strong attention to lifespan issues. It is a library
which has factored out the most common code.

Sharing versus Cloning
----------------------
Since boost is set on the idea of somehow making a simple system and
pushing off dealing with lifespan (something I believe I showed is not
possible without heavily compromising the design), the real
consideration here is how often things are going to be copied versus
how often they will be called. Sharing saves on copying at the expense
of execution. However, functors are infrequently copied (as you can
pass by reference) and called frequently, so unless copying is
prohibitively expensive, cloning is best.
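
For example (the signatures are hypothetical), a call site can take
the functor by reference, so no copy is made just to invoke it; a copy
happens only if the receiver actually stores the functor:

  void invoke_now(const Functor<void>& f);   // calls f(), no copy made
  void hold(const Functor<void>& f);         // copies only when stored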

So what drives up the cost of copying enough to justify sharing?
First, lifespan adds a hidden set of dependencies which the user
can't lock down. This means many accesses to shared data and thus
great expense in a copy. Second, a very factored system has a high
cloning cost. Because much of the type info has been factored out of
the system, operator=() can't be implemented directly; the result is
that the system needs to factor the operator=() out itself. This
doubles the cost of the factorization and thus makes cloning
infeasible.
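
Concretely, in the hand-built vtable sketch earlier, supporting
cloning means each functor type must also supply a copy entry (again
illustrative, continuing the earlier names):

  struct FunctorOps {
    void  (*invoke)(void* data);
    void  (*destroy)(void* data);
    void* (*copy)(const void* data);   // operator=(), factored by hand
  };

  template <class F>
  void* copy_thunk(const void* data)
    { return new F(*static_cast<const F*>(data)); }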

The systems discussed here don't have lifespan, which is the primary
reason for considering cloning; and since there is also little
discussion of using heavy factorization, cloning is really the only
choice.

Hope this sheds some light on the current debate.

--Karl

