From: Phil Endecott (spam_from_boost_dev_at_[hidden])
Date: 2008-02-09 18:40:04


My review of the proposed logging library follows.

Summary: I was disappointed by the amount of code needed to get the
most basic logging working. The use of macros to declare loggers seems
like a bad design choice. The absence of support for syslog was
surprising. I encountered many compiler warnings. The documentation
was unsatisfactory. Many of the features that I would have liked to
see were missing. Despite all this, John seems to be enthusiastic and
responsive: there is a real chance that he will be able to fix many of
these problems.

Detail:

In order to evaluate the library I decided to apply it to Decimail, an
IMAP mail server that I wrote. It currently does some fairly basic
logging of the IMAP conversation, enabled by a flag in its
configuration, to syslog.

Most of the programs that I've written do some sort of logging.
Typically it's something as simple as:

#include <iostream>
cerr << "starting to frobnicate\n";

or

#include <syslog.h>
syslog(LOG_INFO,"starting to frobnicate");

or

#include <assert.h>
assert(frob!=-1);

These techniques are all very simple to use, and I feel that a logging
library needs to be equally straightforward for basic operations:

#include <boost/logging.hpp>
log << "starting to frobnicate";

Of course it needs to offer more features than that - but the most
basic use needs to be simple. (If it's not that simple, then potential
users will instead just write to cerr as above; as their needs slowly
grow they'll add their own ad-hoc logging functions; it will then be
too late to take the leap to a logging library.)

So I started by reading the main_intro.html and getting_started.html
docs. I eventually got to "Example 1", which described how I had to
add two new files to my project (21 lines) and call an initialisation
function from main(). I could then log by changing my code as follows:

     if (Configuration::singleton().get_bool("imapd_log_imap",false)) {
       //syslog(LOG_MAIL|LOG_DEBUG,"{%d} S: * %s",num,msg.c_str());
       L_ << "{" << num << "} S: * " << msg;
     }

Having fixed one error in the example (g_l_filter vs. g_log_filter) and
worked through copious warnings, I was able to see log output on the console (twice?)
and in a file. This was all a lot more effort than I was expecting,
but it did seem to work.

I found the use of macros in example 1 unsatisfactory. Specifically,
the logger "object" which I declare in the header and to which I send
my log messages (L_ in the code above) is actually a macro, not an
object. This means that I have to handle it with great care. For
example, say I currently have:

void do_something(ostream& log_to, .......) {
   log_to << "starting to do something";
   ...
}

and I want to convert this to use this library. I write:

void do_something(logger& log_to, .......) {
   log_to << "starting to do something";
   ...
}

This fails - and probably with an incomprehensible error message.

The motivation for the macro is sensible: it evaluates to something like

   if (l.level > current_level) l

so that time-consuming functions in the log line are not evaluated if
they will not actually be stored. But I would much prefer to see this
optimisation made more explicit, so that the user can tell what's going
on and avoid it if they want or need to:

   logger l;
   l << "message"; // wasteful if logging is disabled, but do I care -
   foo(l); // - because this works as expected.
   LOG_IF_ENABLED(l) << expensive_function(); // if I do care, I can invoke a macro explicitly
#define ll LOG_IF_ENABLED(l) // or if I want conciseness I can _choose_ to do this
   ll << expensive_function();

Or I could do this, without a macro at all, if I were the sort of
person who felt that all macros were evil:

   if (l.enabled()) l << expensive_function();

I think that what's practically needed to make this work is to bundle
the current filter and logger objects into a single _object_, rather
than tying them together with a macro.
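
A minimal sketch of what I mean, assuming nothing about the library's
internals (logger and expensive_function here are my own inventions, and
cerr stands in for real destinations):

#include <iostream>
#include <string>

// stands in for something genuinely costly to compute
inline std::string expensive_function() { return "expensive result"; }

class logger {
public:
   explicit logger(bool on = true) : enabled_(on) {}
   bool enabled() const { return enabled_; }
   void set_enabled(bool on) { enabled_ = on; }
   template <typename T>
   logger& operator<<(const T& x) {
     if (enabled_) std::cerr << x;  // a real logger would forward to its destinations
     return *this;
   }
private:
   bool enabled_;
};

void do_something(logger& log_to) {
   log_to << "starting to do something\n";  // passing the logger around works as expected
   if (log_to.enabled())                    // explicit, macro-free short-circuit
     log_to << expensive_function() << '\n';
}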

Next I tried to use the library's "filter" mechanism in place of the
if-condition you can see above. I looked at "Step 1: Filtering the
message" at workflow.html, expecting to find a filter that I could
initialise with my condition (as a value, or by reference to a
variable, or passing a functor or function-pointer). But it seems that
all of the provided filter classes exist to provide various degrees
of thread-safety. Writing my own is not difficult, though it might be
more conventional to use a functor (i.e. operator()()) rather than a
class with an is_enabled member [this would allow you to use things
like lambda expressions, I think]. So I wrote:

struct log_imap_filter {
   bool is_enabled() const {
     return Configuration::singleton().get_bool("imapd_log_imap",false);
   }
};
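
For comparison, the same filter written as a functor - just a sketch of
the convention I have in mind, not something the library accepts today -
which would also make it easy to drop in plain functions or bound
expressions:

struct log_imap_filter_fn {
   bool operator()() const {
     return Configuration::singleton().get_bool("imapd_log_imap",false);
   }
};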

Hooking my is_enabled filter into the library involved more macros:

// log.hh:
typedef log_imap_filter filter_type;
BOOST_DECLARE_LOG_FILTER(g_log_filter, filter_type)

// log.cc:
BOOST_DEFINE_LOG_FILTER(g_log_filter, filter_type )

I don't know why these need to be macros; to me, they just obfuscate
what is going on.
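
Assuming the macros do little more than declare and define a global
object (which I have not verified), I would have expected to be able to
write simply:

// log.hh:
typedef log_imap_filter filter_type;
extern filter_type g_log_filter;

// log.cc:
filter_type g_log_filter;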

The last thing that I needed to replace my syslog call was to specify
the log level, which I noticed in workflow.html under "Step 2A:
Gathering the message". (The fact that I couldn't find any description
of this fairly central functionality anywhere else, e.g. in any of the
pages in the tutorial section, seems typical of the documentation).
Based on that example, I tried:

   L_(dbg) << "happily debugging";

which failed to compile. I haven't worked out what I should have
written. I did try it with a namespace prefix.

The final thing that I looked at in detail was scoped logging. When
I'm writing more detailed logging code (e.g. for debugging) I generally
make it scoped, i.e. reporting entry to interesting functions and,
normally, their exit. I like to indent each scope by a few characters in the
log output, e.g.

void foo() {
   cerr << "foo starting\n";
   for(int i=0; i<5; ++i) {
     blah(i);
   }
   cerr << "foo done.\n";
}

void blah(int i) {
   cerr << " blah " << i << "\n"; // note indentation
   ...
}

I read the scoped_logs.html page, and it seems that the
BOOST_SCOPED_LOG[_CTX] macro deals with the starting/ending business.
Compared to my explicit start/done output above, it has the feature
that the done message is generated even if the function returns early
via an explicit return statement or throws an exception. This may be
an advantage or a disadvantage: with my code, the absence of the done
message would suggest that an exception might have occurred. Ideally
the end-of-scope log message would indicate whether it was invoked
during exception handling; could that be done using compiler-magic?
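
For what it's worth, std::uncaught_exception() looks like the relevant
piece of compiler-magic. A sketch of the kind of guard I have in mind
(scope_logger is my own invention, not part of the proposed library):

#include <exception>
#include <iostream>

struct scope_logger {
   const char* name;
   explicit scope_logger(const char* n) : name(n) {
     std::cerr << name << " starting\n";
   }
   ~scope_logger() {
     if (std::uncaught_exception())
       std::cerr << name << " leaving due to exception\n";  // stack unwinding in progress
     else
       std::cerr << name << " done\n";
   }
};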

But why does it have to be a macro? And there doesn't seem to be any
indentation or other way of matching up the start and end messages. If
they were matched up, then a log file viewer tool could provide
collapse/expand tree view functionality (does Purify still do something
like that?).

I was also surprised that no use was made of __LINE__, __FUNCTION__
etc. in the scoped logging or other macros.
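
Extending the sketch above: a depth counter gives the indentation, and a
small macro is the natural place to capture __FUNCTION__ and __LINE__
(again my own code, not the library's, and not thread-safe):

#include <iostream>
#include <string>

struct indented_scope_logger {
   static int depth;   // shared depth counter; a sketch only
   const char* func;
   int line;
   indented_scope_logger(const char* f, int l) : func(f), line(l) {
     std::cerr << std::string(2*depth++, ' ') << func << ":" << line << " start\n";
   }
   ~indented_scope_logger() {
     std::cerr << std::string(2*--depth, ' ') << func << " end\n";
   }
};
int indented_scope_logger::depth = 0;

// one per scope; captures the call site's function name and line number
#define SCOPED_LOG_HERE indented_scope_logger scoped_log_here_(__FUNCTION__, __LINE__)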

I also don't see any way to couple the scoped logging features with
enabling and disabling logging. Specifically, I might detect that
something of interest is happening and want to increase the log level
within that scope:

void foo() {
   scoped_log sl(log);
   for (int i=0; i<100; ++i) {
     scoped_enabled_logging_if s(log, i==42);
     blah(i); // logging enabled in here only if i is 42
   }
}
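
A sketch of how scoped_enabled_logging_if could work, assuming a filter
object with is_enabled() and set_enabled() - that interface is my
assumption, not something the library provides as-is:

template <typename Filter>
struct scoped_enabled_logging_if {
   Filter& f;
   bool saved;
   scoped_enabled_logging_if(Filter& filter, bool enable)
     : f(filter), saved(filter.is_enabled()) {
     f.set_enabled(enable);    // enable (or disable) for this scope only
   }
   ~scoped_enabled_logging_if() {
     f.set_enabled(saved);     // restore whatever was in force before
   }
};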

I didn't look in much detail at the "destinations" functionality, but
I'd like to note some of the features that syslog offers for files that
it writes to:

- Flushing of files after each message is optional. Not flushing helps
performance, but it can mean that critical messages are lost if the program crashes before they are flushed.
- Enormous numbers of identical messages within a short period are
compressed to a single line, "last message repeated 193247 times"
(sketched below).
- Periodic "MARK" lines can be generated to show that the application
is still running.
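
The message-compression behaviour, for example, is not hard to sketch
as a destination (this is my own code, not anything the library
provides):

#include <iostream>
#include <string>

class dedup_destination {
   std::string last_;
   unsigned repeats_;
public:
   dedup_destination() : repeats_(0) {}
   void write(const std::string& msg) {
     if (msg == last_) { ++repeats_; return; }   // swallow exact duplicates
     flush_repeats();
     std::cerr << msg << "\n";
     last_ = msg;
   }
   void flush_repeats() {
     if (repeats_ > 0)
       std::cerr << "last message repeated " << repeats_ << " times\n";
     repeats_ = 0;
   }
   ~dedup_destination() { flush_repeats(); }
};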

Other features that I've seen in applications that do their own logging
to files include:

- Creating new files for each sub-process or thread, by appending a
thread identifier to the filename.
- Truncating individual messages to some limit (e.g. to avoid logging a
rogue char* that points to a huge block of stuff with a \0 a megabyte
later), or individual fields within the log message.

In my quick review I didn't see any of that functionality.

The rolling_file feature looks useful for long-running applications
where it's necessary to move the log files (e.g. to compress them)
without interrupting the process. I suggest referring to the logrotate
utility for how this sort of thing is often handled on Unix servers;
for example, it can be configured to send a signal to a process to tell
it to close and re-open its log file with the same filename.
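
Under that convention the application's side of the bargain is small; a
sketch (the signal choice and filename are just examples, and none of
this is in the library):

#include <csignal>
#include <fstream>

std::ofstream log_file;
volatile std::sig_atomic_t reopen_requested = 0;

extern "C" void on_sighup(int) { reopen_requested = 1; }

// called from the logging path, e.g. before each write
void maybe_reopen_log() {
   if (reopen_requested) {
     log_file.close();
     log_file.open("myapp.log", std::ios::app);   // same name, freshly rotated file
     reopen_requested = 0;
   }
}

// at startup:  std::signal(SIGHUP, on_sighup);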

Finally I'd like to look at how the user can control the logging. I
don't see any built-in functionality for this. I have a couple of
suggestions: environment variables and command-line options.

An environment variable could be used as follows. The code would do
something like this:

     g_l()->writer().write(::getenv("MYAPP_LOG_FMT"), ::getenv("MYAPP_LOG_DEST"));

The programmer's choice of format and destination may or may not
override the user's choice via the variables. The user could then
control the format and destination when invoking the program:

$ export MYAPP_LOG_FMT='%time%($hh:$mm.$ss.$mili) [%idx%] |\n'
$ export MYAPP_LOG_DEST='file(debug.log)'
$ myapp

Firefox has something a bit like that IIRC.
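
One wrinkle: getenv() returns a null pointer when a variable is unset,
so defaults are needed. A small helper along these lines would do
(env_or is my own name; the commented-out call just reuses the strings
from the example above):

#include <cstdlib>
#include <string>

std::string env_or(const char* name, const char* fallback) {
   const char* v = std::getenv(name);
   return v ? v : fallback;   // fall back to the programmer's choice when unset
}

// g_l()->writer().write(env_or("MYAPP_LOG_FMT",  "%time%($hh:$mm.$ss.$mili) [%idx%] |\n"),
//                       env_or("MYAPP_LOG_DEST", "file(debug.log)"));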

For command-line integration, I suggest providing an initialisation
function that processes argc and argv looking for debug options, and
strips them:

int main(int argc, char* argv[]) {
   log.init(argc,argv); // modifies argc and argv
   // following code doesn't see debug options
   ...
}

DirectFB has something like that. Again you could use this to define
formats and destinations.

You could also give the user control over scoped filtering via a
command-line interface. For example:

$ myapp --log:default=none --log:parser=debug --log:scheduler=info

The names 'parser' and 'scheduler' would match scoped log-enable
blocks. I could imagine hierarchical names too. I've used one
commercial C++ application that has this sort of logging (you can
indicate component, log level and destination), and it proved useful.
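
A sketch tying those last two ideas together: an init function that
strips --log: options from argv and remembers them for a filtering
layer to parse later (log_init and log_options are my own names):

#include <cstring>
#include <string>
#include <vector>

std::vector<std::string> log_options;   // e.g. "default=none", "parser=debug"

void log_init(int& argc, char* argv[]) {
   int out = 1;
   for (int in = 1; in < argc; ++in) {
     if (std::strncmp(argv[in], "--log:", 6) == 0)
       log_options.push_back(argv[in] + 6);   // remember the option for later parsing
     else
       argv[out++] = argv[in];                // keep ordinary arguments, preserving order
   }
   argc = out;
}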

Conclusion:

* What is your evaluation of the design?

Breaks up the problem into a sensible set of components. Basic usage
requires too much effort. Lacks many features that I would find
useful. Pervasive use of macros is unappealing.

* What is your evaluation of the implementation?

Seems to work. Many compiler warnings do not inspire confidence in
code quality. Have not reviewed the code in any detail.

* What is your evaluation of the documentation?

Poor. But in part, that's because it is documenting the considerable
effort required to get basic functionality working.

* What is your evaluation of the potential usefulness of the library?

Everyone needs logging. If they do it using this library, they'll have
to write a lot of code themselves to implement their own filters,
destinations and so on. In practice they won't bother and will write
their own ad-hoc code.

* Did you try to use the library? With what compiler? Did you have any
problems?

Yes, with gcc-4.1 on Linux. I encountered various warnings which John
has already resolved.

* How much effort did you put into your evaluation?

About 6 hours.

* Are you knowledgeable about the problem domain?

I've written and used numerous programs with logging features, but have
never attempted to write a logging library.

And finally, every review should answer this question:

* Do you think the library should be accepted as a Boost library?

No, not in its current form.

Phil.

