Boost logo

Boost :

Subject: Re: [boost] [logging] Interest check on logging library.
From: Jason Wagner (jason_at_[hidden])
Date: 2009-01-01 23:36:01


On Thu, 01 Jan 2009 09:51:39 -0500, Andrey Semashev wrote:

> Jason Wagner wrote:
>> It started because I thought yours was no longer in development. I
>> have no idea why I thought that, since looking at your cvs logs you
>> have been consistently working on it. It was largely a mistake.
>
> The library is very much alive, although for various reasons I didn't
> touch it for the last few weeks.

Yes, it was completely my mistake for thinking that it was not.

>> I'm writing a internet spaceship game. There's a client and a
>> cluster of servers with different types. The protocol is simple
>> message passing, with the serialization code shared between client
>> and server. This leads to a very simple example like:
>> if(message.size > packet.size) { log << severity::warn() << "Packet
>> was larger than the buffer that contains it!" <<
>> build_target::server() << " Packet source= " << connection << ",
>> account = " << user << "\n" << dump_hex(packet); // error handling }
>> On the client side, we note that the packet was too big and move on.
>> On the server side, we actually log the packet since evil hackers are
>> always trying to steal my space bucks. Since the template generates
>> empty code at compile time, both gcc and msvc are smart enough to
>> remove the " Packet source=" and ", account= " strings from the
>> binary, providing less clue in the client as to what the server is
>> watching for. On the server side, I probably want this record to go
>> into a database, so having the entire thing as a single record is
>> very important. That precludes items of the form:
>> log << severity::warn() << "..."; if(server) // or #ifdef log <<
>> "Packet source = "; //...
>> I also really don't like having the conditional in the logging.
>> Looking at the serialization code, one shouldn't have to wade through
>> the logic of logging.
>
> I see two issues with your approach:
> 1. Although compilers may be smart enough to optimize away writing
> constants and simple variables into a dummy stream (which, I assume, log
> is when logging is disabled), I'm not so sure about more complex
> expressions, like writing the result of a function call, like dump_hex
> in your example. I didn't investigate this, but I suspect the call will
> still be present in the compiled code, which, AFAICT, is not what you
> intended.

MSVC 9 is actually smart enough to do it if there are no side effects.
gcc says it is, but it's definition of "side-effect" seems to cover any
writes to memory at all. I haven't found a situation where it actually
does optimize the call away. The solution I'm going with is lazy
functions using boost::function and boost bind. There's a wrapper called
lazy() that is templated similar to boost bind and returns a
boost::function that is only evaluated if the statement will be logged.

> 2. I'm not sure how you would implement filtering. More on this below.
>
> From where I'm standing, I would suggest using macros. It's not that
> bad in this particular area and would not require changing your code a
> lot:
>
> if(message.size > packet.size)
> {
> LOG() << ...;
> }
>
> where LOG can be defined differently for client and server builds. In
> connection to my library, it could be:
>
> #ifdef SERVER
> # define LOG(logger) BOOST_LOG(logger)
> #else // client
> # define LOG(logger) while (false) BOOST_LOG(logger)
> #endif
>
> That will drop unnecessary code for sure on practically any compiler.
>
>> I think I could add this to your library using features. That's not
>> a light weight thing to add, based upon what I see features need to
>> be instantiated. In my scheme, since the if() test is built into the
>> log record instantiation there's single point to change to affect all
>> appropriate loggers. With your scheme I'd have to change the logger
>> type, the if()s that wrap them or the macros that contain the ifs,
>> and possibly client code as well.
>> Switching to a different scheme, I could do:
>> log << severity::warn() << "Packet was larger than the buffer that
>> contains it!" << auditing::full() << " Packet source= " <<
>> connection << ", account = " << user << "\n" << dump_hex(packet);
>
> The problem is that you ignore run time filtering here, which should
> happen before formatting the record (i.e. before the streaming statement
> you mention above). If you are fine with it, your approach is
> sufficient. Otherwise you'll eventually come to something more complex,
> maybe something like my lib. :)

No, run time filtering works just fine. The type severity::warn is an
typedef for a type and an integer-- basically an enum. There's a trait
for turning the type into a string. Each insertion of a tag adds to an
mpl::map of types to integers (well, integer types). As soon as there's
something that actually gets logged, all of the compile time maps gets
turned into a run time attribute/variable. During record generation, any
time there's a variable change via a tag or explicit setting, the filter
chain is re-evaluated and a new list of sinks is made:

Compile time filters:
severity > debug

Runtime filters:
when (severity >= warning) log to "critical.txt"
when (severity >= info) log to "info.txt"

Statement:
log << severity::debug() // 1
     << "debug message" // 2
     << severity::error() // 3
     << "error message" // 4
     << severity::info() // 5
     << "info message" // 6
     << severity::debug() << "last text"; // 7

1 -- inserting a tag. This tag is disabled at compile time and the <<()
call is an empty function returning a message<false> temporary.
2 -- inserting content. The message<false> object has an empty <<()
function and returns itself.
3 -- inserting tag. This updates the mpl::map and returns a message<true>
object. No actual work is done, just changes the type of the temporary
returned.
4 -- inserting content. The message<true>::operator<<() knows that it's
the first effective call, so takes the mpl::map it carries, sets all the
tags to run-time variables, finds all the interested sinks (critical.txt
and info.txt), then returns.
5 -- inserting tag. This changes the mpl::map, but is still compile time
enabled. It also resets severity in the attribute map, which causes sinks
to be reevaluated, only "info.txt" is listening now.
6 -- inserting content. The pending record for "info.txt" is appended,
the record for "critical.txt" is not.
7 -- inserting tag. The mpl::map gets a new tag, which fails the compile
time condition. The op returns message<false> and eats the next peice of
text in a no-op call.

Run time filters are evaluated only when they need to be, but can be
invoked in the middle of a record.

> If all you need is to globally enable or disable logging at compile
> time, you can define macros as I've shown above. There's no need to play
> with loggers or complicate your code with conditionals. If you need run
> time filtering, the following constraints appear:
> 1. You cannot mix the attributes (sorry, I don't know the equivalent
> term for your library) into the streaming expression. The reason is
> obvious: attributes participate in filtering, and filtering takes place
> before the streaming.
> 2. You will have to write an "if" around the streaming statement. That
> "if" will check the filters in run time and decide whether the record
> will be processed or not. You can wrap it into a macro, but it will be
> there, and it will not be zero-cost in terms of performance and code
> size.
> 3. All attributes you want to put to log have to be set up before that
> "if" statement. My lib provides scoped attributes (or tags) for that
> purpose, although they may not look as neat as your manipulators in the
> streaming statement. In your use case the scoped attributes can be set
> in the macros I've shown above, so that they will not be present in the
> client code.

They also potentially participate in the formatting, even when they don't
participate in the filtering. In those cases, I would consider them part
of the record. It also makes sense in my use cases since the conditionals
occur within the insertion operations themself, not around the statement
as a whole.

>> Now, on the client side I can turn off full auditing at compile time
>> and on the server time I can turn off packet logging at run time and
>> have a condition that turns it on when I suspect a problem. There is
>> a separation of concerns between what audit level and severity of the
>> record.
>> The whole thing could be rewritten with scope variables for
>> connection and user that would allow them to be first class denizens
>> of the database or fed into an event processing system without having
>> to do entity extraction on the record. In addition, immediately
>> insertable variables prevent the overhead of setting scope variables
>> if not needed:
>> log << severity::warn() << "Packet was larger than the buffer that
>> contains it!" << auditing::full() << << set("User", user) <<
>> set("connection",connection) " Packet source= " << connection << ",
>> account = " << user << "\n" << dump_hex(packet);
>
> I'm sorry, I'm afraid, I don't understand what you're saying here.

Part of the problem is that I was typing this when I mistakenly sent the
email. Let me rewrite it:

The sink is a database table with columns severity, user, ip_address, and
message. It maps the variables "user" to column user, "connection" to
connection, "severity" to severity, and the record text to the message
column. I do not want to split up the record into two rows, but I don't
want to dump the packet information except when I'm doing a full audit.
Note that the condition has nothing to do with the severity, only with the
degree of auditing I'm doing.

auditing := sparse < moderate < detailed;

log << severity::warn()
    << "Packet was larger than the buffer that contains it!"
    << set("User", user) << set("connection",connection)
    << " Packet source= " << connection
    << ", account = " << user << "\n"
    << auditing::detailed() << dump_hex(packet);

By default, I set a filter that checks to make sure that auditing <=
sparse. The record will be built without a hex dump of the packet in it,
the auditing::detailed() removes the followign information from it.
Later, if there's reason to suspect problems I can load the filter with
the program running, change the condition to auditing <= detailed, and
suddenly the record will contain a hex dump of packets that over flowed.

In addition, since "user" and "connection" are attributes/variables, I can
put a filter on them so that I have detailed auditing on certain users or
IP addresses, but not on other records.

The thing of key importance is that it's all in a single logging record,
not split between multiple lines. It's one packet of information with
conditional content. That makes it far easier to view logs through GUIs,
databases, or any other type of structured viewing. For systems that
generate tens to hundreds of megabytes of logging per day, that's fairly
important for usability. Grep only goes so far.

-- 
Jason Wagner
jason_at_[hidden]

Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk