Subject: Re: [boost] [system][filesystem v3] Question about error_code arguments
From: Domagoj Saric (dsaritz_at_[hidden])
Date: 2009-11-01 17:33:22


"Detlef Vollmann" <dv_at_[hidden]> wrote in message
news:4AE99A0F.3070809_at_vollmann.ch...
> Stewart, Robert wrote:
>>>From a debugging perspective, the exception will be thrown at the wrong
>>>point.
> I'm not so sure about this argument.
> What you would probably throw is not the original exception,
> but an exception objects of some type like 'unhandled_error'
> that has inside the original exception data.

hmm...this actually points out that i pretty much messed up in my first
post/did not properly define either the solution or the problem it was trying
to solve (actually mixed two solutions for two problems)...

:: one problem is the problem of /unchecked return error/status codes/...which
could, i think, be solved quite well with the proposed error_code object that
asserts that it was inspected...
[ throwing an exception in that case is "out of the question" "in my
book"...that is by definition a programmer error...and those are checked with
asserts...(although i wouldn't mind leaving a configuration option for those
closer to the paranoid/defensive/corporate software world that would place a
throw after the assert for NDEBUG builds)...ironically my first code snippet
did just that...threw in case of an uninspected object even if no error
actually occurred ]
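
a minimal sketch of such an "asserts that it was inspected" object, assuming
C++11. the names checked_status and open_resource are made up for illustration
and are not part of boost:

```cpp
#include <cassert>

// Hypothetical status object: in debug builds it asserts that the caller
// looked at it before it was destroyed.
class checked_status {
public:
    explicit checked_status(int code) : code_(code), inspected_(false) {}

    // Moving transfers the obligation to inspect to the new object.
    checked_status(checked_status&& other)
        : code_(other.code_), inspected_(other.inspected_) {
        other.inspected_ = true;
    }
    checked_status(const checked_status&) = delete;
    checked_status& operator=(const checked_status&) = delete;

    // An uninspected status is a programmer error, so assert instead of throwing.
    ~checked_status() {
        assert(inspected_ && "status code was never inspected");
    }

    // Any query counts as an inspection.
    bool failed() const { inspected_ = true; return code_ != 0; }
    int  code()   const { inspected_ = true; return code_; }

private:
    int code_;
    mutable bool inspected_;
};

// Hypothetical operation that reports its outcome through the status object.
inline checked_status open_resource(bool succeed) {
    return checked_status(succeed ? 0 : 2);
}
```

with NDEBUG defined the assert disappears and the object costs one int and one
bool; a throw after the assert would be the optional "paranoid" configuration.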

:: the other problem is the reporting and handling of errors and "exceptional
situations":
there seem to be three schools here:
- "do not use exceptions at all", which i think is unanimously regarded as
flawed here so there's no need to waste time on it
- "error codes are not-by-the-book, outdated, uncool, unholy historical
artefacts, always use exceptions"
 the boost::error_code documentation provides some good arguments against such
a "fundamentalist" policy - that you sometimes actually need to know the result
at the site of the call which would force you to use try-catch blocks making
your code actually uglier/less readable than if error/status codes were used,
in addition to your code being less efficient (both in terms of space and
time)...
...i would expand on that by saying that it is not always possible to know
which functionality falls into which category...
...for example, as you mentioned, the functions that return codes with
semantics similar to EAGAIN would probably always fall into the "best
reported, inspected and handled with error codes" category...but other
situations might not be so clear cut, e.g. file IO or some special GUI
operation failure would in most situations probably constitute an unexpected (or
"unhandleable at the call site") situation thus falling into the "best
reported, inspected and handled with exceptions" category, but not always: some
already pointed out cases of temporary, cached or networked files...all in all
most issues mentioned deal with system io calls...but imo it would be useful
if we could find a solution that would work "across the board"...i.e. an error
reporting mechanism that could be used not only by low level system and io
libraries but also by gui libraries, xml parsers, string, algorithm and
conversion libraries and so on...that would thus bring the choice of whether to
use exceptions or error codes to the widest possible audience in the widest
possible context/number of applicable situations...it would no longer be the
worry of the library writer which mechanism to use to report errors...the user
would decide for him/herself at the very place of the call!
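
to make "the user decides at the place of the call" concrete, here is a rough
sketch for a string-to-int conversion, assuming C++11. try_parse_int, parse_int,
parse_error and read_widget_value are hypothetical names, and strtol is only a
stand-in for whatever conversion routine a real library would use:

```cpp
#include <cassert>
#include <cerrno>
#include <climits>
#include <cstdlib>
#include <stdexcept>
#include <string>

enum class parse_error { none, invalid, out_of_range };

// Error-code flavour: reports failure through the return value.
inline parse_error try_parse_int(const std::string& text, int& out) {
    errno = 0;
    char* end = nullptr;
    const long value = std::strtol(text.c_str(), &end, 10);
    if (end == text.c_str() || *end != '\0') return parse_error::invalid;
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
        return parse_error::out_of_range;
    out = static_cast<int>(value);
    return parse_error::none;
}

// Exception flavour, for callers deep inside e.g. a parser.
inline int parse_int(const std::string& text) {
    int value = 0;
    if (try_parse_int(text, value) != parse_error::none)
        throw std::invalid_argument("not an int: " + text);
    return value;
}

// Widget-style call site: check the code and recover on the spot.
inline int read_widget_value(const std::string& typed, int fallback) {
    int value = 0;
    return try_parse_int(typed, value) == parse_error::none ? value : fallback;
}
```

the same conversion serves both callers; only the call site differs.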

for example if one uses lexical_cast<> deep in the bowels of some parser it
would probably be "natural" to use a bad_cast exception to report an error to
the outside world but if one uses lexical_cast<> to convert a string entered
into a widget by a user into an int it would mostly be simpler to have a simple
error code if the user entered an invalid string and issue a warning and retry
"at the face of the place"...

...this brings us to the second part of the "equation"..efficiency...
the lexical_cast<> example is perhaps bad because the current implementation
uses std::streams and is thus incredibly inefficient...
{ on msvc 9.0 sp1 the single line boost::lexical_cast<int>( "321" ); caused 14
calls to new and 26 calls to EnterCriticalSection() (not to mention vtable
initializations, virtual function calls, usage of locales...) the
first time it is called, and 3 calls to new and 15 calls to
EnterCriticalSection() on subsequent calls...and caused the binary to increase
by 50 kB! ...which is imnho abhorrent... some perspective:
http://kk.kema.at/files/kkrieger-beta.zip
http://www.youtube.com/watch?v=3vzcMdkvPPg }
so the overhead of one more exception and a try-catch block becomes not as
relevant in comparison (and it can also throw bad_alloc, but this is a
specific issue i will tackle later) but it is also a good example for the same
reason because it points out where you can end up if you forgo efficiency
altogether and simply pile inefficient code upon inefficient code...
...usually i get to read either that exceptions are (or can be) free in terms
of cpu cycles or that exceptions are "extremely expensive" (with numbers that
show how expensive it is to throw and catch exceptions)...
...afaik the first assertion is nonsense...you only need to use your
disassembly window to, for example, inspect generated code for a function that
has local objects with non trivial destructors and a call to a function that can
throw and compare that to the generated code for the same function but with
a call to a function that cannot throw (and that is obvious to the compiler)...
to see how much bookkeeping the compiler must do only because of the presence
of exceptions/just the possibility that a throw might happen...
the second remark, otoh, is irrelevant...nobody really cares how much it costs
to throw an exception (if you are using them anyway), it should be an
"exceptional" situation therefore happening rarely...the overhead of exceptions
that concerns me comes from the overhead imposed by their very
presence/possibility...(this can be tested by building a big c++
library w/ and w/o exceptions, i've seen things like 4MB DLLs dropping to
3MB...)
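
a small pair of functions one could feed to the disassembler for the comparison
described above. this is only a sketch and assumes C++11: noexcept is the
modern spelling of "obvious to the compiler that it cannot throw" (an empty
throw() specification was the closest thing at the time of writing), and all
names are made up:

```cpp
#include <cassert>

static int destroyed = 0;

// Local object with a non-trivial destructor.
struct guard {
    ~guard() { ++destroyed; }
};

// In a real experiment these two would live in another translation unit so
// the optimizer cannot see their bodies; they are defined here only so the
// sketch links on its own.
void may_throw(bool fail) { if (fail) throw 1; }
void never_throws() noexcept {}

// The compiler has to provide an unwind path (cleanup code and/or tables)
// so that g is destroyed if may_throw() exits via an exception.
void caller_with_unwind_path(bool fail) {
    guard g;
    may_throw(fail);
}

// No unwind path is needed: the callee promises not to throw.
void caller_without_unwind_path() {
    guard g;
    never_throws();
}
```

comparing the generated code for the two callers shows what the mere
possibility of a throw adds; how much that is depends on the compiler and the
exception handling model it uses.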

another example...i'm writing a small utility...that will open one file, one
simple dialog and a few registry keys (or another os equivalent)...i want to be
able to do this by handling errors using error codes...because if i'm forced to
use exceptions then i must also link to the full crt which makes my util go
from like 30kB to like 130kB...

sure...i know that a lot of people will now say...so what...who cares about
efficiency (at least in this context or on this scale)...but "i answer that"
with a reminder that we are talking about C++ here...which, afaik, "officially"
inherited the ">you do not pay for what you do not use<" paradigm from C...and
it seems to me that this "foundational imperative" is somehow lost today with
"big name books" giving precedence to 'paranoid'/'defensive'/'safe'/'secure'
and similar buzzwords (while labeling a priori efficiency concerns "the root of
all evil")...thus giving c++ more and more a recognizable 'managed' smack...if
you want ('forced upon you' type of) 'secure' and 'managed' go .net or java or
... there's plenty of 'managed' out there...but there is only one c++...

perhaps i can make use of linus's well known irrational diatribe on c++...
{ http://thread.gmane.org/gmane.comp.version-control.git/57643/focus=57918 or
http://lwn.net/Articles/249460 }
...to make my point. while it truly is a nonsensical diatribe it does, imho,
make one good point (in the sense that my experience has shown this to be
true at times):
"Quite frankly, even if the choice of C were to do *nothing* but keep the C++
programmers out, that in itself would be a huge reason to use C."
...which is later explained with an example along the lines that c++
programmers are taught/encouraged/frequently seen to write code like
std::string astring = "my" + std::string( "cool" ) + "string";
without ever thinking how many memory allocations this causes...
...this could also be reversed to say that we should use c++ if for nothing
else than just to eliminate the typical bad/unsafe/unreadable/bug prone C-style
code...but this would be to miss the point (which is not to argue with linus)
but to say that maybe the c++ "imperative" to first and foremost promote the
writing of 'correct' code should be reformulated to (again) include
efficiency...in the sense that the label 'correct' (code) also includes
'efficient' (code)...

...this is of course a slightly black-and-white way of looking at things and
priorities have to be made "in the real world"...but to require that, at least,
the base line libraries (like the standard library and boost) do not make
hardcoded/non configurable efficiency vs. "something else" compromises and
tradeoffs is imho a justified wish/goal...

anyways, if "we are now convinced" that neither of the "fundamentalist"
approaches is (sufficiently) good we have to look at the third
alternative/"school" and that is the hybrid approach...

but how to go about it..?
...the method proposed by boost::error_code, that you pass a reference to an
error code object to a function based on which it decides whether to throw or
store an error in the object has the:
- advantage that it would work well with functions that otherwise return
something else (like a valid result), i.e. it does not 'take up the space' of
the return value
- disadvantage that you have to predeclare an error_code variable if you do not
want to use the default behaviour
- disadvantage that you can forget to inspect your own error_code object (if
you used it, as in the point above)
- disadvantage that from the standpoint of the compiler the function will still
have to be marked as possibly "throwable" so all the eh machinery would still
have to be inserted into callers "no matter what" (unless it's a simple(r)
function subject to inlining or transparent to the link time code generator)
- advantage that the throw call is within the function (so the code is
'allocated' only once and the point of the throw is closer to the actual place
and condition that caused the error)
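
the "pass an error code reference, the function decides whether to throw or
store" method could look roughly like this. it is a sketch under assumptions:
std::error_code (C++11) stands in for boost's error_code, and the throws()
sentinel, resource_size() and its return values are invented for illustration:

```cpp
#include <cassert>
#include <string>
#include <system_error>

// Sentinel object meaning "the caller passed nothing: throw on failure".
inline std::error_code& throws() {
    static std::error_code sentinel;
    return sentinel;
}

// Hypothetical operation: only the name "known" succeeds.
inline long resource_size(const std::string& name,
                          std::error_code& ec = throws()) {
    const bool wants_throw = (&ec == &throws());
    if (name != "known") {
        const std::error_code failure =
            std::make_error_code(std::errc::no_such_file_or_directory);
        // The single throw site lives inside the function.
        if (wants_throw) throw std::system_error(failure, "resource_size");
        ec = failure;        // caller supplied a code: store, do not throw
        return -1;
    }
    if (!wants_throw) ec.clear();
    return 42;               // arbitrary size for the sketch
}
```

note that resource_size() cannot be declared noexcept, so callers that always
pass their own error code still get the eh machinery, which is the
disadvantage listed above.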

...the method that would use throw and nothrow overloads has the:
- advantage (in comparison to the above) that the compiler could recognize
throw and no throw functions
- disadvantage of a possible "maintenance nightmare"
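
a sketch of the overload pair, assuming C++11 and using the standard
std::nothrow tag to select the non-throwing version. percent_of is a made-up
example function:

```cpp
#include <cassert>
#include <new>        // std::nothrow_t, std::nothrow
#include <stdexcept>

// Non-throwing overload: noexcept lets the compiler see that callers need
// no exception handling support for this call.
inline bool percent_of(int part, int whole, int& out,
                       const std::nothrow_t&) noexcept {
    if (whole == 0) return false;
    out = part * 100 / whole;
    return true;
}

// Throwing overload, implemented on top of the non-throwing one.
inline int percent_of(int part, int whole) {
    int out = 0;
    if (!percent_of(part, whole, out, std::nothrow))
        throw std::domain_error("percent_of: whole is zero");
    return out;
}
```

every function in the interface now exists twice, which is where the
maintenance worry comes from.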

...the method that would use "smart error objects" has the:
- advantage of ensuring that errors are checked whether exceptions or error
codes are used
- advantage of a cleaner/less ugly interface (there is no error_code &
parameter added to every function, and no need to do C-style predeclarations
of error_code variables)
- advantage that the library writer does not have to concern him/herself with
the error reporting mechanism and the user has full/finer-grained control
- disadvantage that it does not work well or at all with functions that would
otherwise return something else (the iterator-bool pair solution with the
std::map<>::insert() function comes to mind, but this seems way too complicated
and ugly for return types not as trivial as an iterator)
- disadvantage that callers might become a little "fatter" because of the
"smarter"/"fatter" error code objects (but this, as previously mentioned, would
vary significantly between use cases...i can even imagine cases where it would
actually be beneficial...e.g. if many functions report the same type of error,
then, even if exceptions are used mostly, this would/could result in smaller
binaries because the code for throwing that error would exist in only one
place, the error code class, instead of in every such function)
- "debugger perspective" disadvantage (which, as argued before, i still do not
actually consider a real issue)
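
one possible shape for such a "smart error object", as a sketch only and
assuming C++11. smart_error and save_document are hypothetical names. the
object throws from its destructor when a failure was ignored, which is
dangerous in general (a throw during stack unwinding ends in
std::terminate), so a real design would need more care:

```cpp
#include <cassert>
#include <stdexcept>

class smart_error {
public:
    explicit smart_error(int code) : code_(code), handled_(false) {}

    // Moving transfers the obligation to handle the error.
    smart_error(smart_error&& other)
        : code_(other.code_), handled_(other.handled_) {
        other.handled_ = true;
    }
    smart_error(const smart_error&) = delete;
    smart_error& operator=(const smart_error&) = delete;

    // Error-code style: looking at the result marks it as handled.
    bool failed() { handled_ = true; return code_ != 0; }
    int  code()   { handled_ = true; return code_; }

    // Exception style: an ignored failure becomes a throw. This is the only
    // place where the throwing code exists.
    ~smart_error() noexcept(false) {
        if (!handled_ && code_ != 0)
            throw std::runtime_error("unhandled error");
    }

private:
    int  code_;
    bool handled_;
};

// Hypothetical library function: it only reports, the caller picks the style.
inline smart_error save_document(bool succeed) {
    return smart_error(succeed ? 0 : 5);
}
```

a caller that writes `save_document(x);` and ignores the result gets an
exception on failure, while `if (save_document(x).failed()) ...` stays in
error-code territory.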

none of the three methods solves the problem of functions that would
throw/return different types of errors/exceptions...

this specific issue could, perhaps, be used to make the decision as to which
functions should use the hybrid error_code/exceptions (or just error codes)
approach and when to switch to the 'standard' approach that uses exceptions...:
when a function deals with only one domain (like io, parsing, regex, gui,
resource management...) it may be sensible that its error codes likewise be
grouped into one domain...so that an io routine does not throw five different
exception objects for five different error conditions that may arise in an io
operation but use a single io_error/exception object that can be queried for an
enum value that describes that actual/specific error...
...this way the problem of different exception types would be solved for
single/specific domain functions...
...for functions that this solution cannot be applied to, that can for example
throw both an io, regex and a memory exception, we could simply say that these
functions are high level api's for which the exception mechanism "naturally"
fits (and could perhaps in certain complex situations even be more efficient
than hordes of if then else statements) and just exceptions should be used...
(sure things like variants could be used but this would probably be way too
cumbersome if not plain ugly and unusable...and would probably have no chance
of becoming part of the standard)
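
the "one exception object per domain, queried for an enum value" idea from
this paragraph might be sketched like this (C++11 assumed; io_errc, io_error
and write_block are illustrative names, not an existing library interface):

```cpp
#include <cassert>
#include <stdexcept>

// All error conditions of the io domain grouped in one enum.
enum class io_errc { not_found, access_denied, disk_full };

// The single exception type for the whole io domain.
class io_error : public std::runtime_error {
public:
    explicit io_error(io_errc code)
        : std::runtime_error(describe(code)), code_(code) {}

    // The specific condition is a value to query, not a separate type.
    io_errc code() const noexcept { return code_; }

private:
    static const char* describe(io_errc code) {
        switch (code) {
            case io_errc::not_found:     return "io: not found";
            case io_errc::access_denied: return "io: access denied";
            case io_errc::disk_full:     return "io: disk full";
        }
        return "io: unknown error";
    }

    io_errc code_;
};

// Hypothetical io routine using the domain exception.
inline void write_block(bool disk_has_space) {
    if (!disk_has_space) throw io_error(io_errc::disk_full);
}
```

a caller needs a single catch (const io_error&) and can switch on code()
inside it.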

...unfortunately this still leaves us with functions that are actually
single-domain but can additionally fail due to dynamic memory allocation usage
(throw bad_alloc)...which are probably frequent enough to constitute a separate
problem...
...like the lexical_cast<> example (ok lexical_cast<>'s use of dynamic memory
allocation and general inefficiency is imo so severe that it should be
considered a bug and fixed but it still serves the purpose of the example)...we
cannot say, ok let it still throw bad_alloc but report its
domain-specific-error using error_codes because that would defeat the whole
purpose of this proposal (for one thing, the compiler would still rightly
consider the function as throwable)...
...the only thing that comes to mind currently is to consider "E_OUT_OF_MEMORY"
a "part" of every domain...
...i can also envision mpl generated do_throw() functions that use steven
watanabe's switch_ library to automatically convert error_codes to (differently
typed) exceptions...(but this again probably has no chance of ever becoming a
part of the standard)...
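
a hand-written version of such a do_throw(), with out-of-memory treated as a
member of the domain. this is only a sketch assuming C++11: the mpl/switch_
generated variant is not shown, and cast_errc and to_digit are made-up names:

```cpp
#include <cassert>
#include <new>
#include <stdexcept>

// Domain error codes, with out-of-memory as a "part" of the domain.
enum class cast_errc { none, bad_input, out_of_memory };

// Converts a code into a differently typed exception per value.
inline void do_throw(cast_errc code) {
    switch (code) {
        case cast_errc::none:          return;
        case cast_errc::bad_input:     throw std::invalid_argument("bad input");
        case cast_errc::out_of_memory: throw std::bad_alloc();
    }
}

// Hypothetical non-throwing conversion. This one never allocates, but an
// allocating implementation would report failure through the same enum.
inline cast_errc to_digit(char c, int& out) noexcept {
    if (c < '0' || c > '9') return cast_errc::bad_input;
    out = c - '0';
    return cast_errc::none;
}
```

a caller that prefers exceptions writes do_throw(to_digit(c, d)); a caller
that prefers codes just inspects the returned value.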

-- 
 "That men do not learn very much from the lessons of history is the most
important of all the lessons of history."
 Aldous Huxley
