Subject: [boost] [Xpressive] vs libpcre performance
From: Sebastian Redl (sebastian.redl_at_[hidden])
Date: 2008-09-13 14:12:01
A German journal recently published a small programming contest with a
very simple text processing problem. The program needs to renumber and
sort footnotes in a long text. I wrote a solution in C++ using a number
of Boost libraries, among them Xpressive, but also Interprocess for
memory-mapping the input file.
Then I looked at the submissions page and found one C++ solution already
there, an admirably short (but rather inflexible) program using libpcre.
Comparing the performance, my program was slightly but consistently
slower than this small program, even though I felt that memory-mapping
the file and scanning it whole, instead of going line by line, ought
to give me a speed advantage.
So to test the performance I took the existing submission and replaced
libpcre with Xpressive (see attached file). I believe the solutions to
be functionally equivalent. However, the original takes 6 seconds to
process a 55MB file, whereas my variation takes ~15 seconds. That's on
the second run of each program, meaning that the entire file is in the
OS cache. This seems awfully slow.
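For anyone who wants to repeat the measurement inside a single process, here
is a minimal sketch of the "time the second run" method described above. It
is not taken from either program: time_warm_run and throughput_mb_per_s are
hypothetical helper names, it assumes a C++11 compiler, and it assumes that
the first call to the work function is what pulls the input into the OS cache.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical helper, not part of the original programs: calls `work` once
// untimed to warm the caches, then times a second call and returns the
// duration of that second call in seconds.
template <typename Work>
double time_warm_run(Work work)
{
    work();  // first run: untimed, warms the caches
    const auto start = std::chrono::steady_clock::now();
    work();  // second run: this is the one that is measured
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// Converts a byte count and a duration into MB/s.
// Assumption: 1 MB is taken as 1,000,000 bytes.
inline double throughput_mb_per_s(std::size_t bytes, double seconds)
{
    assert(seconds > 0.0);
    return static_cast<double>(bytes) / 1e6 / seconds;
}
```

Passing each regex loop to time_warm_run as a callable gives directly
comparable numbers. For scale, the timings quoted above work out to roughly
9.2 MB/s for the libpcre version (55 MB in 6 s) and roughly 3.7 MB/s for the
Xpressive version (55 MB in ~15 s).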
Has anyone done a proper performance comparison between Xpressive and
libpcre? Boost.Regex's performance page lists PCRE, but only version 4.1,
whereas 7.8 is the most recent release.
Athlon 64 2000MHz (64-bit mode)
Linux 2.6.23, GCC 4.1.2
Boost trunk as of 2008-09-11