Subject: Re: [boost] Reforming Boost.System and <system_error> round 2
From: Andrey Semashev (andrey.semashev_at_[hidden])
Date: 2018-01-16 13:11:13
On 01/16/18 00:18, Niall Douglas via Boost wrote:
> That leaves the request to fix "if(ec) ..." which right now returns true
> if the value is not 0, even though much code writes "if(ec) ..." to mean
> "if error then ...". There is also the issue of error coding schemes not
> being able to have more than one success value, which usually must be 0.
> That's the remaining discussion point.
> Question: is this still considered too much overhead?
I think the test you presented is rather optimistic in that it consists of
a single translation unit. In real applications, I think the following is
more common:
- The error category is often implemented in a separate translation unit
from the code that sets or tests for error codes with that category. This
follows from the existing practice of declaring the category instance as a
function-local static, where the function is defined in a separate
translation unit (a sketch of this layout follows the list).
- The code that sets the error code is often in a separate TU from the
code that tests for errors. This follows from the typical separation
between a library and its users.
Given the above, unless LTO is used, I think the compiler will most
often not be able to optimize the virtual function call.
I've converted your code into a synthetic benchmark consisting of one
header and two translation units (one with the test itself and the other
defining the error category). The test still does not isolate the code that
produces the error code from the code that analyzes it, so in that regard
it is still a bit optimistic.
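For reference, the benchmark is roughly shaped like the two files below;
the names, the iteration count and the producer function are mine, not the
exact code. The experimental variant differs only in that the error code
type's boolean test calls the virtual `failure` function instead of
comparing the value with zero.

    // producer.cpp -- separate TU, so the call cannot be inlined without LTO
    #include <system_error>

    std::error_code produce_error(int i) noexcept
    {
        return std::error_code(i, std::generic_category());
    }

    // test.cpp -- rough shape of the timing loop
    #include <chrono>
    #include <cstdio>
    #include <system_error>

    std::error_code produce_error(int i) noexcept; // defined in producer.cpp

    int main()
    {
        const long iterations = 100000000;
        long failures = 0;

        auto start = std::chrono::steady_clock::now();
        for (long i = 0; i < iterations; ++i)
        {
            std::error_code ec = produce_error(static_cast<int>(i & 1));
            if (ec) // the operation being measured
                ++failures;
        }
        auto end = std::chrono::steady_clock::now();

        auto usec = std::chrono::duration_cast<std::chrono::microseconds>(
            end - start).count();
        std::printf("%lld usec, %f tests per second\n",
            static_cast<long long>(usec),
            iterations * 1000000.0 / static_cast<double>(usec));

        return failures == iterations / 2 ? 0 : 1;
    }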
I'm using gcc 7.2 and compiling the code with -O3. Here are the results
on my Sandy Bridge CPU:
Experimental test: 275565 usec, 362890788.017346 tests per second
std test: 45767 usec, 2184980444.425023 tests per second
This is a 6x difference.
In the generated code I noticed that the compiler emits a check of whether
the virtual function `failure` is actually
`experimental::error_category::failure`. If it is, the code uses an inlined
version of this function; otherwise, the actual indirect call is performed.
So if you comment out `code_category_impl::failure`, the check succeeds and
the indirect call is avoided. Here are the results for that case:
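This looks like GCC's speculative devirtualization. A hand-written,
source-level approximation of the transformation (stand-in names, not the
proposed interface and not the actual generated code; the real check
compares the vtable target rather than using typeid) would be:

    #include <typeinfo>

    // Minimal stand-in for the category base class, only to show the shape
    // of the optimization.
    struct error_category_base
    {
        virtual bool failure(int val) const noexcept { return val != 0; }
        virtual ~error_category_base() = default;
    };

    // Guess the dynamic type, inline the expected body, and keep the
    // indirect call only as a fallback for overriding categories.
    inline bool failure_speculative(const error_category_base& cat, int val)
        noexcept
    {
        if (typeid(cat) == typeid(error_category_base))
            return val != 0;     // inlined body of the non-overridden failure()
        return cat.failure(val); // genuine virtual dispatch
    }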
Experimental test: 71711 usec, 1394486201.559036 tests per second
std test: 48177 usec, 2075679266.039812 tests per second
This is still a 1.5x difference.
Now, I admit that this synthetic benchmark focuses solely on the single
check of the error code value. Real applications will likely have much more
code intervening between tests for error codes, so the real-world effect
will be less pronounced. Still, I thought some estimate of the performance
penalty would be useful, if only to show that this is not a zero-overhead
change.
Do I think this overhead is significant enough? Difficult to tell.
Certainly I'm not happy about it, but I could probably live with the 1.5x
overhead. However, it still results in code bloat, and there is no
guarantee this optimization will be performed by the compiler (or that it
will be effective if, e.g., my code always overrides
`error_category::failure`). The thing is, every bit of overhead makes me
more and more likely to consider dropping `error_code` in favor of direct
use of error codes; `error_code` already carries the additional baggage of
the pointer to the error category.
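To illustrate what I mean by direct use of error codes, here is a minimal
sketch (hypothetical names, not a concrete proposal): the enum is used on
its own, there is no category pointer to carry around, and the success test
is a plain integer comparison the compiler can always inline.

    #include <cstdint>

    // Hypothetical library-specific error enum used directly, without
    // error_code.
    enum class db_errc : std::int32_t
    {
        success = 0,
        not_found = 1,
        io_failure = 2
    };

    inline bool failed(db_errc e) noexcept { return e != db_errc::success; }

    // Stand-in implementation so the sketch is self-contained.
    db_errc open_table(const char* name) noexcept
    {
        return name != nullptr ? db_errc::success : db_errc::not_found;
    }

    void use()
    {
        if (failed(open_table("users")))
        {
            // handle the error
        }
    }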