
Subject: Re: [boost] [log] Review-ready version in the Vault
From: vicente.botet (vicente.botet_at_[hidden])
Date: 2009-02-11 21:34:30


----- Original Message -----
From: "Andrey Semashev" <andrey.semashev_at_[hidden]>
To: <boost_at_[hidden]>
Sent: Wednesday, February 11, 2009 9:10 PM
Subject: Re: [boost] [log] Review-ready version in the Vault

>> Reading the code I see a lot of
>>
>> try{
>> //
>> } catch (...){
>> // Something has gone wrong.
>> }
>>
>> Does it mean that the application is unable to know when things go
>> wrong?
>
> Yes. Although extremely useful, logging is an auxiliary feature of the
> application, so it must not influence business logic. That is, users
> should not be bothered to wrap every log statement in a try/catch block.

IMO, it is up to the application to decide what to do in exceptional cases.
The logging library could catch its own exceptions, but not the user's exceptions, and in no case should it let the user believe that a record has been logged when it has not.
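Something along these lines is what I have in mind (just a sketch; all the names are hypothetical, not a proposal for the library's interface):

#include <iostream>
#include <stdexcept>

// Hypothetical exception type for the library's own failures.
struct logging_error : std::runtime_error
{
    explicit logging_error(const char* what) : std::runtime_error(what) {}
};

// User-replaceable reaction to internal logging failures (hypothetical hook).
typedef void (*logging_error_handler)(const logging_error&);

void default_handler(const logging_error& e)
{
    std::cerr << "log record lost: " << e.what() << std::endl;
}

logging_error_handler on_logging_error = &default_handler;

void push_record(/* const record& rec */)
{
    try
    {
        // ... format and write the record; this may throw a logging_error,
        // or an exception coming from a user-supplied attribute or formatter ...
    }
    catch (const logging_error& e)
    {
        // The library's own failure: report it without breaking the application.
        on_logging_error(e);
    }
    // Anything else (a user exception) propagates unchanged, so the caller
    // never silently believes a record was written when it was not.
}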

 
>> I'm mainly interested on how the library behaves on the context of
>> multi threaded programs, performances, ...
>>
>> A trivial question, are all the sources associated to all the sinks?
>
> Sources are not associated with sinks directly. A log record from any
> source may go to any one or several sinks. Which sinks will receive the
> record is entirely decided by filters.

So we can consider that sources + core + sinks form a single "log".

>> If this is the case, how the core is made thread-safe, a single
>> mutex, Which operations will occur more often read-only or write? It
>> is worth to use a shared_mutex?
>
> I assume that pushing log records through the core is a much more
> frequent operation than, say, registering sinks, changing filters or
> adding global attributes. Pushing records requires only read access to
> the core (well, to the thread-shared part of it, that is). That is why
> shared_mutex allows to significantly reduce contention for the major case.

Ok, I see the shared_mutex.
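Just to check that I read it correctly, I picture the locking roughly like this (a sketch of my understanding, not the library's actual code):

#include <boost/thread/shared_mutex.hpp>
#include <boost/thread/locks.hpp>

class logging_core
{
    boost::shared_mutex m_mutex;
    // ... sinks, global filter, global attributes (the thread-shared state) ...

public:
    void push_record(/* const record& rec */)
    {
        // Frequent path: many threads may filter and dispatch concurrently.
        boost::shared_lock<boost::shared_mutex> lock(m_mutex);
        // ... run the filters and feed the accepted sinks ...
    }

    void add_sink(/* shared_ptr<sink> const& s */)
    {
        // Rare path: reconfiguration takes the mutex exclusively.
        boost::unique_lock<boost::shared_mutex> lock(m_mutex);
        // ... modify the thread-shared configuration ...
    }
};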
 
>> Why don't let the user the ability of
>> instantiating cores?
>
> Because the core contains all the set up of the library. Sources and
> some other parts of the library rely on that it is a singleton, which
> allows to simplify library interface significantly.

So the Boost.Log library consists of a single log with multiple sources and multiple sinks.
The fact that we can declare as many loggers as we want, statically or as class members, gives the illusion of multiple logs, but we really have just one core log.

If an application needs to log different things in separate files (different modules, different people, third-party code), we need to create two sinks and a tag dispatcher, add the corresponding filters, create two channel_loggers, and share the tag dispatcher between the different modules, people and possibly third parties. It seems to me that this cannot scale. We really need to be able to instantiate the core log.
I don't think that the explicit creation of a core log and the association of sources and sinks with a core would complicate the library. The library could in addition provide a singleton log, so that by default sources and sinks would be associated with this default core log.
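To illustrate the kind of setup I mean, here is a rough sketch with hypothetical helper types (not the library's interface): one dispatch loop standing for the single core, two file sinks, each keeping only the records of "its" channel.

#include <boost/shared_ptr.hpp>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

struct record { std::string channel, message; };

// A sink that accepts only the records tagged with one channel.
struct channel_file_sink
{
    std::ofstream file;
    std::string accepted_channel;

    channel_file_sink(const char* path, const std::string& channel)
        : file(path), accepted_channel(channel) {}

    bool will_accept(const record& rec) const { return rec.channel == accepted_channel; }
    void consume(const record& rec) { file << rec.message << '\n'; }
};

int main()
{
    // Both sinks live in the same, single core; only the filters separate them.
    std::vector< boost::shared_ptr<channel_file_sink> > sinks;
    sinks.push_back(boost::shared_ptr<channel_file_sink>(
        new channel_file_sink("module_a.log", "module_a")));
    sinks.push_back(boost::shared_ptr<channel_file_sink>(
        new channel_file_sink("module_b.log", "module_b")));

    record rec = { "module_a", "hello from module A" };
    for (std::size_t i = 0; i < sinks.size(); ++i)   // this dispatch is what the core does
        if (sinks[i]->will_accept(rec))
            sinks[i]->consume(rec);
}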
 
<snip>
 
>> The single difference between synchronous_sink and asynchronous_sink
>> is that the later will do the formating and writing later on on an
>> other thread, isn't it? In both cases there is a mutex associated to
>> the sink. This mutex could be the bottleneck of the system, only one
>> mutex resource for all the user threads.
>
> Remember that the mutex is only associated to a single sink. Writing to
> different sinks can be concurrent. Theoretically, the logging core can
> employ some clever scheduling between sinks, but this is not currently done.
>
> As for asynchronous sink frontend, the mutex is only needed to protect
> the queue of records for the dedicated thread.

Yes, but writing to different sources will check the filters on all the sinks. Doesn't this need to lock the mutex on each sink?
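For reference, this is how I read the frontend (my reading, possibly wrong): the filter test is a read-only operation on the sink, and the mutex only protects the pending-record queue, so the check itself would not take it. The names below are mine, not the library's.

#include <boost/thread/mutex.hpp>
#include <boost/thread/locks.hpp>
#include <deque>
#include <string>

struct record { std::string message; };

class async_sink_frontend
{
    boost::mutex m_queue_mutex;      // protects only the pending-record queue
    std::deque<record> m_queue;      // consumed later by the dedicated thread

public:
    bool will_accept(const record& /*rec*/) const
    {
        // Read-only filter test; no lock is taken here as long as the filter
        // itself is not being changed concurrently.
        return true;
    }

    void enqueue(const record& rec)
    {
        // Only the hand-over to the dedicated thread needs the mutex.
        boost::lock_guard<boost::mutex> lock(m_queue_mutex);
        m_queue.push_back(rec);
    }
};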

> I would happily use some
> lock-free queue for this purpose, but Boost doesn't have one and I don't
> feel confident to implement one. I plan to add support for Intel TBB later.

I don't feel confident either, at least not yet :(
 
>> Another approach for the asynchronous_sink could be to use a queue of
>> log records for each thread (thread specific storage). Each log
>> records must be timestamped with the creation date and a monotonic
>> counter (time is not enough fine grained) and as the queue is
>> specific to the thread no need to use a mutex. On the other side
>> there is a concentrator which takes one by one the elements ordered
>> by the timestamp+counter.
>
> Something like that is planned in the future. However, I was going to
> use a single queue to interleave records during pushes.
>
>> So only the current thread can push on this queue because is specific
>> to the thread. There is a single thread, the concentrator that pops
>> from these queue. In this context we can ensure thread safety without
>> locking as far as the queue has at least two messages. (The Fork/Join
>> Java framework and Boost.ThreadPool use this technique).
>
> You still need to synchronize pushes and pops, no matter how many queues
> you use. You do have two threads accessing the same queue anyway. While
> this will surely reduce contention, it is not lock-free.

Please take a look at the Fork/Join Java framework. The idea is that we don't always need to synchronize, because the two threads modify different memory locations as long as there are more than two messages in the queue, so it is almost lock-free in this particular 1-writer/1-reader context. I feel confident about this particular case.
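To show what I mean by almost lock-free for one writer and one reader, here is a minimal sketch of a bounded queue built on an array and two indexes. It is a classic single-producer/single-consumer ring buffer rather than the Fork/Join deque itself, and it uses C++ atomics purely for illustration; it is not the Boost.ThreadPool code.

#include <atomic>
#include <cstddef>

// Bounded single-producer / single-consumer queue: the writer only moves
// m_tail and the reader only moves m_head, so no mutex is needed.
template< typename T, std::size_t Capacity >
class spsc_queue
{
    T m_buffer[Capacity];
    std::atomic<std::size_t> m_head;   // next slot to read  (owned by the consumer)
    std::atomic<std::size_t> m_tail;   // next slot to write (owned by the producer)

public:
    spsc_queue() : m_head(0), m_tail(0) {}

    bool push(const T& value)          // called only by the logging thread
    {
        const std::size_t tail = m_tail.load(std::memory_order_relaxed);
        const std::size_t next = (tail + 1) % Capacity;
        if (next == m_head.load(std::memory_order_acquire))
            return false;              // full; the caller decides what to do
        m_buffer[tail] = value;
        m_tail.store(next, std::memory_order_release);
        return true;
    }

    bool pop(T& value)                 // called only by the concentrator thread
    {
        const std::size_t head = m_head.load(std::memory_order_relaxed);
        if (head == m_tail.load(std::memory_order_acquire))
            return false;              // empty
        value = m_buffer[head];
        m_head.store((head + 1) % Capacity, std::memory_order_release);
        return true;
    }
};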

>> There is one more issue here, because all the threads can log, while
>> on the ThreadPool only the threads in the pool can add tasks on the
>> queue of the worker thread. So the concentrator needs to have access
>> to all the thread_specific storage.
>>
>> The Boost.Interthreads library defines a thread_specific_shared_ptr
>> class which extends the thread_specific_ptr class with synchronized
>> access to thread_specific_shared_ptr of a thread from another, either
>> giving the thread::id or iterating on a map : thread::id ->
>> shared_ptr<stored_data>.
>>
>> I have not done yet any performances comparation, but I think that
>> the bottleneck is avoided.
>
> Looking forward to see this library in Boost! The idea of separate
> queues, along with lock-free queue from TBB, looks very appealing.

You can already take a look at https://svn.boost.org/trac/boost/wiki/LibrariesUnderConstruction#Boost.InterThreads or download it from the vault: http://www.boostpro.com/vault/index.php?action=downloadfile&filename=interthreads.zip&directory=Concurrent%20Programming&

The library defines an async_ostream in the deferred traces example. Note that in the example the queues are not yet lock-free; I'm working on it (I need to replace the std::queue with an array and two indexes).
 
>> In addition this will solve the issue with log records that are
>> weakly ordered in a multithreaded applications.

Do you at least plan to solve the weak-ordering issue before the review?
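What I have in mind for the strict ordering is roughly this (hypothetical names, just a sketch): the concentrator looks at the front of every per-thread queue and always emits the record with the smallest (timestamp, counter) key.

#include <cstddef>
#include <deque>
#include <string>
#include <vector>

// Each record carries its creation timestamp plus a monotonic counter,
// because the clock alone is not fine-grained enough to order records.
struct stamped_record
{
    unsigned long long timestamp;
    unsigned long long counter;
    std::string message;
};

typedef std::deque<stamped_record> per_thread_queue;

// Pop the globally next record across all per-thread queues. In a real
// implementation the concentrator must also be careful with queues that are
// momentarily empty but whose producer may still deliver an earlier record.
bool pop_next(std::vector<per_thread_queue>& queues, stamped_record& out)
{
    std::size_t best = queues.size();
    for (std::size_t i = 0; i < queues.size(); ++i)
    {
        if (queues[i].empty())
            continue;
        const stamped_record& r = queues[i].front();
        if (best == queues.size()
            || r.timestamp < queues[best].front().timestamp
            || (r.timestamp == queues[best].front().timestamp
                && r.counter < queues[best].front().counter))
        {
            best = i;
        }
    }
    if (best == queues.size())
        return false;                  // nothing pending anywhere
    out = queues[best].front();
    queues[best].pop_front();
    return true;
}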
 
>> If I have time before the review I'll try to use this approach to
>> define an strict_asynchronous_sink which will ensure strict order and
>> minimize the contention. Have you some performance test I can use to
>> compare?
>
> No, I haven't done any dedicated performance tests yet. I will try to
> produce something a bit later.

Looking forward to making a comparison,
Vicente

