Boost logo

Boost :

Subject: Re: [boost] [system] Would it be possible to trial a breaking change to Boost.System and see what happens?
From: Andrey Semashev (andrey.semashev_at_[hidden])
Date: 2018-01-13 13:09:19


On 01/13/18 02:58, Niall Douglas via Boost wrote:
>>> 1. Stop treating code zero as always meaning success irrespective of
>>> category. This remedies a defect where some custom error code domains
>>> cannot be represented by error_code due to using value 0 for an error.
>>> In actual code, I've never seen anyone ever use comparison to anything
>>> but a default constructed error code, so I think this will be safe.
>>
>> In my code I'm using contextual conversion to bool extensively, e.g. "if
>> (err)" or "if (!err)". This is currently equivalent to comparing against
>> 0, so I assume it will be broken by this change. So, you can count me
>> right now as the one (loudly) complaining, among many others, I think.
>
> Sorry, I forgot to mention that the category would be asked whether the
> error code's current value is success or failure (its's Friday, and my
> current contract is many hours drive from home. Sorry)
>
> This is unavoidable because system_category (on POSIX and Win32) and
> generic_category both define value = 0 as success. In some other error
> code domain, -1 might mean success. Or 9999. Up to the domain. No
> requirement that 0 be set aside as special, as currently.

I appreciate that the special meaning of 0 may be undesirable, but
having a virtual function call overhead whenever one wants to test if
the error_code has a success value seems a too high price for that. I
don't want to be paying that price in my Linux programs, sorry.

>>> 2. Success becomes solely the default constructed error code, which is
>>> code zero and a new category of "null_category". This is internally
>>> represented by { 0, nullptr }, and thus makes the default constructor
>>> trivial which is highly desirable as it eliminates any compiler magic
>>> static fencing for the default constructor. Right now the default
>>> constructor uses system_category with value 0, and I suspect no code
>>> will notice this change either.
>>
>> This breaks "category()" as it returns a reference now.
>
> It would still return a reference, just as now. You would check for
> internal nullptr, if so return a static null_category.

Why check in the first place if you could initialize the pointer to the
global instance of the category?

>> Also, "message()" has to check for a null category now.
>
> I don't think that much cost relative to constructing a std::string.

True, but it looks like an unnecessary cost nevertheless. (Note that a
small string construction is very cheap and can fit in a few
instructions, including the string contents initialization.)

>>> 3, Make the default constructor constexpr, now possible. No code should
>>> notice a difference, except less runtime code will be generated.
>>
>> Not sure there will be any difference in the generated code, since the
>> constructor still has to initialize the int and the pointer. Optimizers
>> are already able to propagate constants even if the code is not constexpr.
>
> But it cannot elide the potential call to an extern function, the
> function which retrieves the error_category&. And therefore must emit
> code, just in case the system_category might be fetched.

I don't think the current standard requires an external function call.
The default constructor can obtain the pointer to the global category
instance directly, if it is exported. `system_category()` can be an
inline function as well.

> The present design of error_code generates code bloat when used by
> Outcome or Expected. Changing it as I describe stops forcing the
> compiler to emit code, and thus allows the compiler to fold more code
> during optimisation, thus emitting less assembler. Tighter, less bloaty
> code is the result.

I assume, you mean the code that calls `system_category()` in runtime? I
think, this is a quesion of QoI that can be solved as I described above
without the need to break users' code or adding runtime overhead.

>>> 4. error_code::message() returns a std::string_view instead of
>>> std::string if running C++ 17 or better. This lets us remove the
>>> <string> dependency, and thus stop dragging in half the STL with
>>> <system_error> which in turn makes <system_error> actually useful to the
>>> embedded systems crowd. The guess here is that std::string_view has such
>>> excellent interoperation with std::string that I think 99% of code will
>>> compile and work perfectly without anybody noticing a thing.
>>
>> This will only work if "error_category::message()" returns a string from
>> a static storage. It will not allow relying on strerror_r or similar
>> API, for example. This will make migration problematic if users define
>> their own error categories that rely on external API like that.
>
> You are correct that the no 4 change is by far the most consequential
> proposed. I do not believe it affects end user code except for those who
> inherit from error_code and override message() - surely a very, very
> tiny number of people - but it sure as hell breaks everybody who
> implements a custom error code category if you change the category's
> message() signature.

I don't have an estimate on how widespread custom error categories are,
but I surely have a few. I know that several Boost libraries define
their categories (namely, Boost.Filesystem, Boost.ASIO, Boost.Beast,
Boost.Thread come to mind). Searching on GitHub[1] finds a few examples
of custom categories (with a lot of noise from the comments though).
Judging by that, I wouldn't say that custom categories are rare.

[1]:
https://github.com/search?l=C%2B%2B&q=error_category&type=Code&utf8=%E2%9C%93

> The number of people worldwide currently implementing their own error
> categories is likely low, and me personally speaking feel that their
> mild inconvenience is worth losing <string> from the header
> dependencies. But if backwards compatibility were felt to be super
> important, in error_code::message() you could try dynamic_cast to an
> error_category2 first in order to utilise an
> error_category2::message_view() function. If the dynamic_cast fails, you
> statically cache the strings returned by error_category::message() and
> return string_views of them.

The `dynamic_cast` is yet more expensive than a null check, and it may
be more expensive than constructing a small `std::string`. It may also
be problematic on the embedded systems, which often have RTTI disabled.
Caching the string in a static storage means thread safety issues, which
will probably require TLS. This is a lot of complication and runtime
overhead just to remove one #include.

> This is an API breaking change being discussed. Code _might_ break
> beyond custom error categories.
>
> But I currently believe that code other than custom error categories
> _won't_ break. We are changing the API in ways that I would be very,
> very surprised if much real world code out there even notices.

Code that relies on a specific error category in the default-constructed
`error_code` will break.

> Finally, Boost has broken API much more severely than this in the past
> as part of natural evolution (not always intentionally). If it's
> preannounced that we're doing this, and we do it, I don't see a problem
> here. Plenty of old APIs have been retired before. And besides, this is
> for testing a proposed change to a future C++ standard. That's a great
> reason to break API, if you're going to do it.

I don't think I remember an intentional major breakage of API without a
deprecation period and a clear migration path. Such breakages are
typically introduced as new libraries that live alongside with the old ones.

Some parts of your proposal (namely, switching to `string_view`) do not
have a clear migration path, especially for C++03 users. Other parts
seem to be targeted at a rather niche use case but have high associated
overhead that is going to be applied to all uses of `error_code`. And,
on top of that, users' code may break, including silently. Sorry, but to
me this doesn't look like a safe change that can be done to a widely
used core library. Boost.System2, that will live side by side with the
current Boost.System, seems more appropriate. I realize that you won't
know how much of the world will be affected by the proposed changes.


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk