From: John Maddock (jm_at_[hidden])
Date: 2003-03-28 07:01:17
> I am writing a multithreaded Apache log parser that uses the Boost
> 1_29_0 regex split function to separate elements in the entry. Each
> thread parses a separate log file. The code seems to be working
> correctly on a 1-CPU system, but when I use a 14-CPU Sun server, I
> see massive locking (LCK column of prstat -amLvu username), and
> performance suffers horribly (as measured by the lines processed per
> second). I spent a lot of time checking to see where the locking was
> occurring. I went so far as to compile the code with Sun's Forte 6u2
> and use their analysis tools to identify the problem area. I've
> compiled all code (including Boost) with both gcc 3.2.2 and Forte to
> create 64-bit binaries, if that makes any difference.
> If I read the Forte analysis tools correctly, the place I'm seeing
> all the locking is the call to malloc in the void *operator
> new(unsigned long), which is called by
> boost::re_detail::match_results_base and _priv_match_data. Those are
> in turn called by query_match_aux, which is called by reg_grep2.
> Assuming I'm reading it right...
> At this point it seems like the issue is either with the library or
> my usage of it. Has anyone seen this before? Any pointers on what I
> may be doing wrong and how to fix it would be appreciated.
The locking is occurring in your runtime library rather than in boost.regex
as such. You have two choices:
1) Use a custom allocator for the match_results class instance that you are
using that uses thread-specific memory pools.
2) Wait for the next release (probably still a couple of months away), which
will use much less dynamic memory allocation (almost none at all in
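
For what it's worth, option 1 might look roughly like the sketch below. This
is only an illustration under stated assumptions: ThreadArena, local_arena,
ThreadPoolAllocator and fill_and_sum are hypothetical names, not part of
Boost, and the allocator is exercised with std::vector rather than
match_results, since the exact way to pass an allocator to match_results
depends on the release you are using. It also assumes a C++11 compiler (for
thread_local). The idea is that each thread carves small allocations out of
its own large blocks, so malloc (and its lock) is hit once per block rather
than once per allocation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Hypothetical per-thread bump arena. Memory is handed out from large
// blocks and only returned to the system when the arena is destroyed.
class ThreadArena {
public:
    explicit ThreadArena(std::size_t block_size = 64 * 1024)
        : block_size_(block_size) { assert(block_size_ > 0); }
    ~ThreadArena() { for (void* b : blocks_) std::free(b); }
    ThreadArena(const ThreadArena&) = delete;
    ThreadArena& operator=(const ThreadArena&) = delete;

    void* allocate(std::size_t bytes) {
        // Round up so every returned pointer is suitably aligned.
        const std::size_t align = alignof(std::max_align_t);
        bytes = (bytes + align - 1) / align * align;
        if (bytes > remaining_) {
            // Current block exhausted: fetch a new one from malloc.
            const std::size_t size = bytes > block_size_ ? bytes : block_size_;
            void* block = std::malloc(size);
            if (!block) throw std::bad_alloc();
            blocks_.push_back(block);
            cursor_ = static_cast<char*>(block);
            remaining_ = size;
        }
        void* p = cursor_;
        cursor_ += bytes;
        remaining_ -= bytes;
        return p;
    }

    std::size_t block_count() const { return blocks_.size(); }

private:
    std::size_t block_size_;
    std::vector<void*> blocks_;
    char* cursor_ = nullptr;
    std::size_t remaining_ = 0;
};

// One arena per thread; it is destroyed when the owning thread exits.
inline ThreadArena& local_arena() {
    thread_local ThreadArena arena;
    return arena;
}

// Minimal standard-style allocator that forwards to the calling
// thread's arena.
template <class T>
struct ThreadPoolAllocator {
    using value_type = T;
    ThreadPoolAllocator() noexcept {}
    template <class U>
    ThreadPoolAllocator(const ThreadPoolAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        return static_cast<T*>(local_arena().allocate(n * sizeof(T)));
    }
    // No-op: memory is reclaimed in bulk when the thread's arena dies.
    void deallocate(T*, std::size_t) noexcept {}
};

template <class T, class U>
bool operator==(const ThreadPoolAllocator<T>&,
                const ThreadPoolAllocator<U>&) noexcept { return true; }
template <class T, class U>
bool operator!=(const ThreadPoolAllocator<T>&,
                const ThreadPoolAllocator<U>&) noexcept { return false; }

// Stand-in for per-thread parsing work: fills a container through the
// allocator and returns the sum of its elements.
long fill_and_sum(int n) {
    std::vector<int, ThreadPoolAllocator<int>> v;
    for (int i = 0; i < n; ++i) v.push_back(i);
    long sum = 0;
    for (int x : v) sum += x;
    return sum;
}
```

Any thread that calls fill_and_sum draws from its own arena, so the threads
do not contend with each other on allocation. The trade-off is that nothing
is freed until the thread exits, and objects allocated this way must not
outlive the thread that created them, so this only suits short-lived,
thread-confined data such as per-line match results.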