Subject: Re: [boost] [log] Review-ready version in the Vault
From: Andrey Semashev (andrey.semashev_at_[hidden])
Date: 2009-02-11 15:10:50
vicente.botet wrote:
> Oh, I forgot: Bravo! Excellent documentation.
Thanks. :)
> Reading the code I see a lot of
>
> try{
> //
> } catch (...){
> // Something has gone wrong.
> }
>
> Does it mean that the application is unable to know when things go
> wrong?
Yes. Although extremely useful, logging is an auxiliary feature of the
application, so it must not influence business logic. That is, users
should not have to wrap every log statement in a try/catch block.
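Schematically, the boundary looks like this (a rough sketch with made-up
names, not the actual library code): the try/catch lives inside the
library's delivery path, so a log statement in user code needs no
protection of its own.

    #include <string>

    // Hypothetical sketch: a backend whose write may throw, wrapped by a
    // frontend that swallows the error so it never reaches user code.
    struct file_backend_sketch
    {
        void consume(std::string const& formatted)
        {
            // writing to disk may throw (disk full, permissions, ...)
        }
    };

    struct frontend_sketch
    {
        file_backend_sketch backend;

        void consume(std::string const& formatted)
        {
            try
            {
                backend.consume(formatted);
            }
            catch (...)
            {
                // Swallow the failure: a broken log write must not
                // affect business logic.
            }
        }
    };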
> I'm mainly interested in how the library behaves in the context of
> multithreaded programs, performance, ...
>
> A trivial question: are all the sources associated with all the sinks?
Sources are not associated with sinks directly. A log record from any
source may go to any one or several sinks. Which sinks will receive the
record is entirely decided by filters.
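Schematically (a sketch with made-up names, not the actual core code),
the dispatch works like this: a source just pushes the record into the
core, and each registered sink carries a filter that alone decides
whether the sink receives it.

    #include <cstddef>
    #include <string>
    #include <vector>
    #include <boost/function.hpp>
    #include <boost/shared_ptr.hpp>

    // Hypothetical sketch of filter-driven dispatch.
    struct sink_sketch
    {
        boost::function< bool (std::string const&) > filter; // accept or reject a record
        virtual void consume(std::string const& record) = 0;
        virtual ~sink_sketch() {}
    };

    struct core_sketch
    {
        std::vector< boost::shared_ptr< sink_sketch > > sinks;

        void push_record(std::string const& record)
        {
            for (std::size_t i = 0; i < sinks.size(); ++i)
            {
                // No source-to-sink binding: any sink may receive any
                // record, subject only to its filter.
                if (!sinks[i]->filter || sinks[i]->filter(record))
                    sinks[i]->consume(record);
            }
        }
    };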
> If this is the case, how is the core made thread-safe? A single
> mutex? Which operations will occur more often, read-only or write? Is
> it worth using a shared_mutex?
I assume that pushing log records through the core is a much more
frequent operation than, say, registering sinks, changing filters or
adding global attributes. Pushing records requires only read access to
the core (well, to the thread-shared part of it, that is). That is why
a shared_mutex significantly reduces contention in the common case.
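In other words, the locking pattern is roughly the following (a
simplified sketch, not the actual core implementation):

    #include <boost/thread/shared_mutex.hpp>
    #include <boost/thread/locks.hpp>

    // Simplified sketch of read-mostly locking in a logging core.
    class core_locking_sketch
    {
        boost::shared_mutex m_mutex;
        // ... filters, sinks, global attributes ...

    public:
        void push_record(/* record */)
        {
            // Frequent path: many threads may hold the shared lock at once.
            boost::shared_lock< boost::shared_mutex > lock(m_mutex);
            // read filters and deliver the record to the matching sinks
        }

        void set_filter(/* filter */)
        {
            // Rare path: reconfiguration takes the lock exclusively.
            boost::unique_lock< boost::shared_mutex > lock(m_mutex);
            // replace the global filter
        }
    };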
> Why not give the user the ability to instantiate cores?
Because the core contains all of the library's setup. Sources and
some other parts of the library rely on it being a singleton, which
simplifies the library interface significantly.
> If I have understood correctly, in multithreaded applications we
> don't always need to use logger_mt, but we do need to use
> synchronous_sink or asynchronous_sink.
You only have to use *_mt loggers if the logger may be used by several
threads simultaneously. In all other cases regular loggers are a better
choice, since they are faster.
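For example (a sketch; the exact header paths may differ in the current
snapshot):

    #include <boost/log/sources/logger.hpp>
    #include <boost/log/sources/record_ostream.hpp>

    boost::log::sources::logger_mt g_logger;   // shared between threads, so _mt is needed

    void worker()
    {
        boost::log::sources::logger lg;        // used by this thread only, no _mt needed
        BOOST_LOG(lg) << "worker started";
        BOOST_LOG(g_logger) << "status via the shared logger";
    }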
As for sinks, it is the sink backend that decides. All sink backends
provided by the library out of the box will, indeed, require you to use
either synchronous_sink or asynchronous_sink in a multithreaded
environment. However, it is possible to create a backend that does not
require that (for example, if the thread synchronization is done
somewhere else, not in the sink frontend).
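Roughly, a synchronous frontend does nothing more than this (a
simplified sketch):

    #include <boost/thread/mutex.hpp>

    // Simplified sketch: the frontend serializes access to a backend
    // that is not thread-safe by itself.
    template< typename BackendT >
    class synchronous_frontend_sketch
    {
        boost::mutex m_mutex;
        BackendT m_backend;

    public:
        template< typename RecordT >
        void consume(RecordT const& record)
        {
            boost::mutex::scoped_lock lock(m_mutex);
            m_backend.consume(record);   // the backend sees records one at a time
        }
    };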
> The only difference between synchronous_sink and asynchronous_sink
> is that the latter does the formatting and writing later, on another
> thread, isn't it? In both cases there is a mutex associated with the
> sink. This mutex could be the bottleneck of the system: only one
> mutex resource for all the user threads.
Remember that the mutex is only associated with a single sink. Writing
to different sinks can be concurrent. Theoretically, the logging core
could employ some clever scheduling between sinks, but this is not
currently done.
As for the asynchronous sink frontend, the mutex is only needed to
protect the queue of records for the dedicated thread. I would happily
use a lock-free queue for this purpose, but Boost doesn't have one and I
don't feel confident implementing one. I plan to add support for Intel
TBB later.
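To give an idea, the queue in question is nothing more exotic than this
(a simplified sketch):

    #include <queue>
    #include <string>
    #include <boost/thread/mutex.hpp>
    #include <boost/thread/locks.hpp>
    #include <boost/thread/condition_variable.hpp>

    // Simplified sketch of the record queue between logging threads and
    // the dedicated writer thread of an asynchronous frontend.
    class record_queue_sketch
    {
        std::queue< std::string > m_queue;
        boost::mutex m_mutex;
        boost::condition_variable m_cond;

    public:
        // Called by any logging thread.
        void push(std::string const& record)
        {
            boost::unique_lock< boost::mutex > lock(m_mutex);
            m_queue.push(record);
            m_cond.notify_one();
        }

        // Called by the dedicated thread that feeds the backend.
        std::string pop()
        {
            boost::unique_lock< boost::mutex > lock(m_mutex);
            while (m_queue.empty())
                m_cond.wait(lock);
            std::string record = m_queue.front();
            m_queue.pop();
            return record;
        }
    };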
> Another approach for the asynchronous_sink could be to use a queue
> of log records for each thread (thread-specific storage). Each log
> record must be timestamped with its creation date and a monotonic
> counter (time alone is not fine-grained enough), and since the queue
> is specific to the thread there is no need for a mutex. On the other
> side there is a concentrator which takes the elements one by one,
> ordered by the timestamp+counter.
Something like that is planned for the future. However, I was going to
use a single queue to interleave records during pushes.
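For reference, one way to build the ordering key described above (a
sketch only; here the counter is read as per-thread, so no locking is
needed for it):

    #include <string>
    #include <boost/date_time/posix_time/posix_time.hpp>

    // Sketch of the ordering key such a scheme needs (illustrative only):
    // records are ordered by timestamp, and a per-thread monotonic counter
    // breaks ties between records created within one clock tick.
    struct ordered_record
    {
        boost::posix_time::ptime stamp;  // creation time
        unsigned long seq;               // monotonic counter of the creating thread
        std::string message;
    };

    // The concentrator always pops whichever pending record compares smallest.
    inline bool earlier(ordered_record const& a, ordered_record const& b)
    {
        return a.stamp < b.stamp || (a.stamp == b.stamp && a.seq < b.seq);
    }

    ordered_record make_record(unsigned long& thread_local_counter, std::string const& msg)
    {
        ordered_record rec;
        rec.stamp = boost::posix_time::microsec_clock::universal_time();
        rec.seq = ++thread_local_counter;  // counter is thread-local, no locking needed
        rec.message = msg;
        return rec;
    }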
> So only the current thread can push onto this queue, because it is
> specific to the thread. There is a single thread, the concentrator,
> that pops from these queues. In this context we can ensure thread
> safety without locking, as long as the queue has at least two
> messages. (The Fork/Join Java framework and Boost.ThreadPool use this
> technique.)
You still need to synchronize pushes and pops, no matter how many queues
you use. You do have two threads accessing the same queue anyway. While
this will surely reduce contention, it is not lock-free.
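To illustrate the remaining synchronization: even with one producer and
one consumer per queue, both threads still have to agree on the queue's
head and tail. A minimal ring-buffer sketch (illustrative only, written
with C++0x-style atomics):

    #include <atomic>
    #include <cstddef>

    // Minimal single-producer/single-consumer ring buffer sketch. Even
    // without a mutex, m_head and m_tail are shared state that the two
    // threads must synchronize on.
    template< typename T, std::size_t Capacity >
    class spsc_queue_sketch
    {
        T m_items[Capacity];
        std::atomic< std::size_t > m_head;  // next slot to read (advanced by the consumer)
        std::atomic< std::size_t > m_tail;  // next slot to write (advanced by the producer)

    public:
        spsc_queue_sketch() : m_head(0), m_tail(0) {}

        bool push(T const& item)            // producer thread only
        {
            std::size_t tail = m_tail.load(std::memory_order_relaxed);
            std::size_t next = (tail + 1) % Capacity;
            if (next == m_head.load(std::memory_order_acquire))
                return false;               // full
            m_items[tail] = item;
            m_tail.store(next, std::memory_order_release);
            return true;
        }

        bool pop(T& item)                   // consumer thread only
        {
            std::size_t head = m_head.load(std::memory_order_relaxed);
            if (head == m_tail.load(std::memory_order_acquire))
                return false;               // empty
            item = m_items[head];
            m_head.store((head + 1) % Capacity, std::memory_order_release);
            return true;
        }
    };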
> There is one more issue here, because all the threads can log, while
> in the ThreadPool only the threads in the pool can add tasks to the
> queue of the worker thread. So the concentrator needs to have access
> to all the thread-specific storage.
>
> The Boost.Interthreads library defines a thread_specific_shared_ptr
> class which extends the thread_specific_ptr class with synchronized
> access to another thread's thread_specific_shared_ptr, either by
> giving the thread::id or by iterating over a map: thread::id ->
> shared_ptr<stored_data>.
>
> I have not yet done any performance comparison, but I think that
> the bottleneck is avoided.
Looking forward to seeing this library in Boost! The idea of separate
queues, along with a lock-free queue from TBB, looks very appealing.
> In addition, this will solve the issue of log records being only
> weakly ordered in multithreaded applications.
>
> If I have time before the review I'll try to use this approach to
> define a strict_asynchronous_sink which will ensure strict ordering
> and minimize contention. Do you have a performance test I can use for
> comparison?
No, I haven't done any dedicated performance tests yet. I will try to
produce something a bit later.