Boost logo

Boost :

Subject: Re: [boost] [log] Patch: rotating file collector
From: Alexander Arhipenko (arhipjan_at_[hidden])
Date: 2013-04-24 11:09:37


Hi Andrey,

First of all, thanks for such a detailed review.

Please, find my answers/proposals below.

On Tue, Apr 23, 2013 at 11:27 PM, Andrey Semashev
<andrey.semashev_at_[hidden]> wrote:
> On Tuesday 23 April 2013 16:18:44 you wrote:
>> On Tue, Apr 23, 2013 at 3:58 PM, Alexander Arhipenko
> <arhipjan_at_[hidden]>wrote:
>> > I've added implementation of rotating file collector -
>> > the patch against boost log trunk attached.
>> [snip]
>> ...
>>
>> Thanks for the patch. I didn't look deeply into it yet but from the first
>> glance it looks like you changed the existing file collector ...
>> [snip]
>> ...
>
> I took a closer look now and my first impression was wrong - the original
> behavior is still accessible, this is good. However, I have several notes:
>
> 1. I would still prefer the interface to be distinct for different types of
> collectors, and not based on the file name pattern presence. These are
> distinct objects with different behaviour and it should be apparent from the
> collector construction what behavior it will implement.
>

Agree to that. I also had an intention to provide separate function for
rotating collector, but decided to expand existing one just to save typing.

> I'm prepared to rename the current file::make_collector method to better
> describe the current behavior (I'm open to suggestions on the new name). The
> new collector should be created with a different function, also appropriately
> named (file::make_stacking_collector?). It's also probably a good idea to
> extract collectors and the base file::collector interface to separate headers.
>

I see 2 options here.
First, have 2 functions: make_collector, make_rotating_collector.
Second, have:
a) enum rotation_method { rotate_sequential, rotate_??? }
  (enum collection_method ?)
and
b) pass additional parameter rotation_method to make_collector function

Personally I would prefer second approach.

> 2. The new collector should also be kept in the collectors repository. This
> way it is possible to create multiple sinks collecting files in the common
> target directory, without different collectors interferring with each other.
> Basically, you'll have to move file_collector_hook and
> enable_shared_from_this< file_collector > base classes to file_collector_base,
> as well as m_pRepository.
>
> This introduces an additional possible issue if the user tries to create
> different sinks with different collectors targeting the same directory. This
> case should be detected (e.g. by checking the type of the collector in
> file_collector_repository::get_collector) and handle it with an exception
> (log::setup_error).
>

The reason I haven't added rotating collector to repository
is the "possible issue" you are describing:
several collectors targeting one directory but different files.
In current implementation,
it's impossible to have more than 1 collector for one directory.
It is okay when the collector responsibilities are:
* files copying to target directory
* old files removal (according to max size and/or min free space setup)
...but does not work for rotating collector case.
To elaborate more, consider use case:
* /var/log - is a directory where all the system log files reside
* foo.log - foo service log file
* bar.log - bar service log file
Both foo and bar log files have to be rotated, e. g.:
foo.log (latest), foo.log.1 (previous), ..., foo.log.N (earliest).
Thus, user have to setup 2 collectors and will got log::setup_error.

The easiest way to make described use case working -
do not store rotating collectors in repository.
The more sophisticated approach would be:
use additional criteria for collector identification (file_name pattern?).

Please, comment on this.

> 3. In rotating_file_collector::scan_for_files, invalid file name pattern
> should also be handled with setup_error.
>

Agree, will do.

> 4. I think, it should be possible to avoid the do_rollover method.
> file_collector_base::store_file should be moved to file_collector, and there
> should be a separate rotating_file_collector::store_file method as well. Any
> common subroutines (like checking available free space, erasing old files,
> etc.) can be extracted as distinct functions into file_collector_base and
> called from store_file methods of the derivatives.
>

I don't see any advantages of eliminating do_rollover method
in favour of common subroutine methods (as well as vice versa ;) ).
I don't mind to implement approach you are describing,
but maybe you can provide additional arguments why do we need this change?

> 5. You don't seem to rename the collected files when the file pattern contains
> date/time (see date_rollover_policy). Is that right? Obviously, you will only
> have the date and time point of the rotation and not the file creation.
>

You're right that collected files are not renamed for date/time patterns.
I don't have any strong opinion about whether file name should indicate
file creation or file rotation time.
However, gut feeling inclines me to the second case (file rotation time).

Let's consider existing logging components.
E.g. Python's logging module provides TimedRotatingFileHandler to deal with
time based rotation.
This handler indicates file creation time in the rotated file name.
But the files are rotated periodically (file sizes could differ).

On the opposite, I have following real life use case.
User logins to system and wants to monitor events for day 23.
There are 2 log files in the log directory: file.log and file.log.2013-04-21.
If filename indicates creation time, then it's hard to guess
what file to look at: records could be found in both files.
If 2013-04-21 indicates rotation time,
then it's clear that the record for day 23 is in file.log.

It will be interesting to hear opinions from you.

> 6. Actually, the date/time support for the new collector seems quite broken.
> The current behavior, when the file is originally named by the sink, ensures
> that the date/time in the file name corresponds to the first records in the
> file. If the date/time is added by the collector, the timestamp will
> correspond to the latest written records, which it counterintuitive. The check
> in rotating_file_collector::scan_for_files prevents users from setting
> date/time in the sink, so basically, it will only be usable if there is only a
> file counter placeholder in the file name pattern.
>
> I'm not sure what is the best way to solve this. Maybe it's better to avoid
> dealing with date/time placeholders in the collector at all and only work with
> file counters. But scan_for_files should be changed so that it is possible to
> set a pattern with date/time placeholders in the sink backend. In that case I
> think it's reasonable to remove the file name pattern from the file collector
> construction and simply always add file counters right before the extension.
> That way the file name pattern is only set in the sink backend, which
> simplifies the interface a lot.
>

Idea of dealing with counters only seems reasonable to me.
I need to analyze this a little bit more...
Please, also consider my comments regarding point 5.

> 7. You should probably rebase your patch to the Boost trunk, as the library
> has been merged recently. For now it's pretty close to the SourceForge SVN, so
> you should not have any problems. I'm going to work in Boost SVN now, so the
> SourceForge SVN will get outdated eventually.
>

Will do.

>> Also, I would like to discuss following question with you :
>> > rotate_file invocation in ~text_file_backend -
>> > I would prefer to have this call during text_file_backend construction
>> > phase.
>>
>> I'll have to take a closer look later to answer that but it seems odd to
>> have to rotate something when nothing have been written yet. The
>> rotate_file call is intended to place the written file to the storage
>> governed by the collector, and at construction this file hasn't been
>> created yet. Could you describe why you need this?
>
> This question remains actual.
>
>

I'll try to explain why I'm requesting this.
Point 1 - when rotating collector is used,
file name that is currently being written is somewhat constant.
But when application shut down, it becames something like
file_name.log + ".1" (counter name pattern case).
I would like to have file name the same for both cases:
when service is running and when not running.
Also, if user collects log files to e.g. /var/log/archive folder but
the log file itself resides in /var/log directory,
then latest application log should be seeked in 2 place which is annoying.

Point 2 - user changed collector configuration (decreased min free space).
After starting application he would expect that some files will be removed,
but this will occur only on next rotation phase.
So, I would re-phrase:
* it's preferable to *try* rotate_file
  during text_file_backend construction phase
* there is no need to rotate_file during text_file_backend destruction

Andrey, I will wait for you response and continue working on patch.

Thanks and Regards


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk