From: John Maddock (jm_at_[hidden])
Date: 2002-10-24 05:52:26


> I compiled 1.29.0 (Linux, gcc 3.2) and noticed that regexp library
> (release, shared, stripped) has around 500kb. Checking the installed
> system libraries, rx, perlre and regex are in the 50-100kb range. I
> believe this is a bit much (and used to be even more with the older
> versions of gcc).
>
> I only took a cursory glance at the source, and couldn't find anything
> too much out of the ordinary - mainly just basic STL... has anybody else
> checked this for codebloat? (and I don't believe it's /just/ normal
> template inflation, since my own code that uses boost::signal and
> assorted STL currently compiles under 100kb).

Ok here's the low-down:

When compiled as a shared library there is a lot of stuff in there: three
different traits classes (you'll only use one of these, but which one?),
support for two different character types (char and wchar_t), support for
the POSIX API layers, and a high-level simplified wrapper class that some
newbies find useful (which also includes things like file searching).

If you compile and link as a static library then you obviously only link to
what you need, and code size should be in the 100K range (it depends a bit
on which traits class you use - if you start using std::locale then code
size can go up quite a bit).

The library also contains pre-built instances of the more commonly used
templates (reg_expression<char> and reg_expression<wchar_t>); this helps
with both compile and link times, and BTW your object files will be a lot
smaller as a result. Again, do a static link and you only pay for what you use.
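
To illustrate what pre-built template instances mean, here is a minimal
sketch of explicit template instantiation. It is not the regex library's
actual source: toy_pattern and found_in are hypothetical names made up for
the example, standing in for a class template such as reg_expression.

```cpp
#include <string>

// Hypothetical stand-in for a class template like reg_expression<charT>.
// It only does literal substring search, to keep the sketch short.
template <class CharT>
class toy_pattern {
public:
    explicit toy_pattern(const std::basic_string<CharT>& literal)
        : literal_(literal) {}

    // Returns true if the stored literal occurs anywhere in text.
    bool found_in(const std::basic_string<CharT>& text) const {
        return text.find(literal_) != std::basic_string<CharT>::npos;
    }

private:
    std::basic_string<CharT> literal_;
};

// Explicit instantiation definitions. In a real library these would sit in
// a single .cpp file that is compiled into the library, so the code for
// both character types is generated once, there.
template class toy_pattern<char>;
template class toy_pattern<wchar_t>;

// Assumption: the matching header would then carry declarations such as
//   extern template class toy_pattern<char>;
// (standard since C++11, a compiler extension before that) so that client
// object files do not generate their own copies.
```

Client code then uses toy_pattern<char> or toy_pattern<wchar_t> as usual and
picks up the already compiled members at link time, which is why the
client's own object files stay small.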

I suppose I should refactor into multiple shared libs, but IMO that's a
maintenance nightmare: I'm not saying it won't happen, just that it's not as
trivial as it sounds :-(

Finally there is one problem here: because regexes are interpreted strings,
you pay for every feature you might use, not just for those that you actually
do use. User feedback up until now has been driving towards more features, not
fewer. On the other hand, you could probably put together a very basic
toy-regex implementation in about 10K; IMO its use would be very limited
though.
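
For a sense of how small a toy regex can be, here is a sketch adapted from
the well-known matcher by Rob Pike in "The Practice of Programming". It is
not taken from the regex library, and the namespace toy and the function
names are my own choices for the example. It supports only literal
characters, '.', '*', '^' and '$', which also shows how limited such a
thing is.

```cpp
#include <string>

namespace toy {

bool match_here(const char* re, const char* text);

// Matches zero or more occurrences of c, followed by the rest of the
// pattern re, at the start of text.
bool match_star(char c, const char* re, const char* text) {
    do {
        if (match_here(re, text)) return true;
    } while (*text != '\0' && (*text++ == c || c == '.'));
    return false;
}

// Matches the pattern re at the start of text.
bool match_here(const char* re, const char* text) {
    if (re[0] == '\0') return true;                   // pattern exhausted
    if (re[1] == '*') return match_star(re[0], re + 2, text);
    if (re[0] == '$' && re[1] == '\0') return *text == '\0';
    if (*text != '\0' && (re[0] == '.' || re[0] == *text))
        return match_here(re + 1, text + 1);
    return false;
}

// Searches for the pattern re anywhere in text.
bool search(const std::string& re, const std::string& text) {
    const char* r = re.c_str();
    const char* t = text.c_str();
    if (r[0] == '^') return match_here(r + 1, t);     // anchored at start
    do {
        if (match_here(r, t)) return true;            // try every position
    } while (*t++ != '\0');
    return false;
}

}  // namespace toy
```

For example, toy::search("ab*c", "xxabbbcyy") succeeds, while
toy::search("^ab", "cab") does not. There are no character classes,
alternation, captures or locale support, and every one of those would add
code whether or not a given pattern uses it.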

John Maddock
http://ourworld.compuserve.com/homepages/john_maddock/index.htm

