Subject: Re: [boost] [function] function wrapping with no exception safety guarantee
From: Daniel Walker (daniel.j.walker_at_[hidden])
Date: 2010-10-30 12:43:51


On Fri, Oct 29, 2010 at 4:13 PM, Jeff Flinn
<TriumphSprint2000_at_[hidden]> wrote:
> Daniel Walker wrote:
>>
>> On Thu, Oct 21, 2010 at 2:25 AM, Doug Gregor <doug.gregor_at_[hidden]>
>> wrote:
>>>
>>> On Wed, Oct 20, 2010 at 1:51 PM, Daniel Walker
>>> <daniel.j.walker_at_[hidden]> wrote:
>>>>
>>>> So,
>>>> adding "empty" vtable objects would increase the space requirements of
>>>> boost::function; each template instantiation would need two
>>>> corresponding vtable objects, one for actual targets and one as a
>>>> fallback that throws bad_function_call. (The "empty" vtable objects
>>>> could be stored statically, but static storage is also a very precious
>>>> resource on many platforms.) Not all users would appreciate this
>>>> trade-off; i.e. an increase in the space overhead for a small decrease
>>>> in time overhead (with compiler optimization the time savings are
>>>> minuscule, in my experience).
>>>
>>> Without quantifying the trade-off, we don't know which way is better.
>>>
>>> In any case, this issue is fairly easy to settle; it just takes
>>> effort. Someone implements this alternative scheme and determines the
>>> cost/benefit in space and time, and with any luck the choice is
>>> obvious.
>>
>> OK, I implemented the alternative scheme and ran some benchmarks to
>> determine the cost/benefit in space/time. The code is in a Trac ticket
>> with a patch that allows boost::function to be configured to represent
>> its empty state using a static object that contains an "empty"
>> function which calls boost::throw_exception when invoked. The user may
>> select the alternative scheme by defining the macro
>> BOOST_FUNCTION_USE_STATIC_EMPTY.
>>
>> https://svn.boost.org/trac/boost/ticket/4803
>>
>> I also attached a tarball to the ticket with a Jamfile and source code
>> to compile and run benchmarks of a function pointer, boost::function,
>> and boost::function using the static empty scheme. The time overhead
>> per call and space overhead per object are measured by the executables
>> at runtime. The space overhead per type is the size of the initialized
>> static data section in the executable's data segment as reported
>> by /usr/bin/size.
>>
>> The following results were obtained on an x86 machine running a Unix
>> variant using the manufacturer's build of gcc 4.2. The machine was not
>> in a laboratory environment, so I am not controlling for changes in
>> operating temperature that could impact performance at these time
>> scales. The following tables present the raw numbers from bjam's
>> release and debug builds. (See the Jamfile for the complete arguments
>> to bjam.)
>>
>> Data (Release):
>>            | function ptr |  function   | function (static empty)
>> time/call  |  6.28e-10s   |  6.68e-09s  |  6.40e-09s
>> space/type |     32B      |     48B     |     64B
>> space/obj  |      8B      |     32B     |     32B
>>
>> Result (Release): Defining BOOST_FUNCTION_USE_STATIC_EMPTY yields a 4%
>> decrease in time overhead per call but doubles the space overhead per
>> type. (On msvc 10 the decrease in time overhead is closer to 10%.)
>>
>> Data (Debug):
>>            | function ptr |  function   | function (static empty)
>> time/call  |  6.33e-09s   |  2.62e-08s  |  2.26e-08s
>> space/type |     32B      |     48B     |     64B
>> space/obj  |      8B      |     32B     |     32B
>>
>> Result (Debug): Defining BOOST_FUNCTION_USE_STATIC_EMPTY yields an 18%
>> decrease in time overhead per call but doubles the space overhead per
>> type.
>>
>> So, I think the current boost::function implementation is certainly
>> the right default, since many users would not appreciate doubling the
>> static space overhead for a time savings of less than 10% per call.
>> However, I think it is a good idea to offer users the opportunity to
>> tinker and experiment with this trade-off, so that they can choose
>> what works best for their application.
>
> I've followed this thread only peripherally, but the important question for
> me is: does removing the check for an empty function affect the optimized
> performance/code size in the non-empty scenarios? In my experience, these
> are the scenarios most often encountered.

Yes, if you look at the code in the benchmark, you will see that it is
measuring the cost of a call to a non-empty boost::function. In
optimized object code, the call is 4% faster without the check, but
removing the check means that it is necessary to store a special,
internal static object per instantiation to hold an "empty" function
that must be available if boost::function becomes empty. This static
object doubles boost::function's space overhead in the initialized
static data section of the executable's data segment.
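
To make the two schemes concrete, here is a minimal sketch of the idea.
It is not the code from the patch: every name in it is hypothetical, it
is fixed to the signature int(int), and it collapses whatever the real
vtable holds into a single function pointer. It only illustrates where
the per-call check goes away and where the extra static entity comes
from.

```cpp
#include <cassert>
#include <stdexcept>

// All names here are hypothetical; this is not Boost.Function's real layout.
struct bad_call : std::runtime_error {
    bad_call() : std::runtime_error("call to empty function wrapper") {}
};

typedef int (*target_fn)(int);

// Scheme 1: the empty state is a null pointer, so every call is checked.
class checked_function {
public:
    checked_function() : target_(0) {}
    explicit checked_function(target_fn f) : target_(f) {}
    bool empty() const { return target_ == 0; }
    int operator()(int x) const {
        if (target_ == 0) throw bad_call();   // branch paid on every call
        return target_(x);
    }
private:
    target_fn target_;
};

// The static "empty" target: it exists only to throw when invoked.
inline int throw_when_empty(int) { throw bad_call(); }

// Scheme 2: the empty state points at the static target above, so the
// call path needs no branch, at the cost of that extra static entity.
class static_empty_function {
public:
    static_empty_function() : target_(&throw_when_empty) {}
    explicit static_empty_function(target_fn f)
        : target_(f ? f : &throw_when_empty) {}
    bool empty() const { return target_ == &throw_when_empty; }
    int operator()(int x) const { return target_(x); }  // unconditional
private:
    target_fn target_;
};

inline int twice(int x) { return 2 * x; }

// Helper: reports whether calling f throws bad_call.
template <class F>
bool throws_bad_call(const F& f) {
    try { f(0); } catch (const bad_call&) { return true; }
    return false;
}

// Both schemes should be indistinguishable to the caller.
inline void self_check() {
    assert(throws_bad_call(checked_function()));
    assert(throws_bad_call(static_empty_function()));
    assert(checked_function(&twice)(21) == 42);
    assert(static_empty_function(&twice)(21) == 42);
}
```

Calling self_check() exercises both wrappers in the empty and non-empty
states. The observable behavior is the same; the difference is only in
what each call costs and in what must be kept in static storage.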

Daniel Walker

