Boost-Build :

From: John Maddock (john_at_[hidden])
Date: 2006-08-20 12:23:16


Zack Weinberg wrote:
> Please bear with me while I explain my problem.
>
> I'm involved in the development of Monotone (http://venge.net/monotone)
> which uses several Boost libraries, including boost::regex. We do not ship
> our own copy of Boost; on systems like Debian Linux with a rich prepackaged
> set of libraries we rely on the distributor's version. We've noticed that
> on many such systems the prepackaged boost::regex includes ICU support,
> which we do not use. We care because the ICU libraries drag in libpthread,
> and therefore, even if we use the non-threadsafe variants of the Boost
> libraries, we get thread-safe behavior from the C++ runtime, which hurts
> performance to the tune of 15% of runtime on some loads(!) [1]
>
> A logical solution to this problem is to split libboost_regex.so into two
> shared objects, let's call them libboost_regex_core.so and
> libboost_regex_icu.so, and arrange that the latter is only loaded at
> runtime if the program actually uses the ICU support. Looking at the
> source code, I see that ICU is well-isolated in its own source file, so it
> *could* be as simple as compiling libs/regex/src/icu.cpp into
> libboost_regex_icu.so and everything else into libboost_regex_core.so ...
> except that without further cleverness, this would break both build-time
> and runtime compatibility. Everyone currently expects that they can just
> do -lboost_regex at build time, and that a single DT_NEEDED for
> libboost_regex_<suffix>.so.NNN is sufficient at runtime.
>
> Now, at least with some linkers (notably GNU ld 2.17 or later) cleverness
> is possible: I append to this message a demonstration shell script which
> does it for a pair of mocked-up shared libraries. The key is the two
> variations of libfoobar.so -- the runtime version that exports no symbols
> but brings in both the 'base' libraries with DT_NEEDEDs, and the
> build-time version that is really a script telling GNU ld to pick up only
> those 'base' libraries that are needed. With this arrangement, existing
> programs linked against libboost_regex continue to work (and continue to
> drag in libraries they may not need); freshly compiled programs get only
> what they need. Don't take my word for it; run the script, then do ldd on
> the *_xdeps and *_better executables, and see the difference for
> yourselves.
>
> So my actual question to you, is how do I implement all of this in a
> Jamfile. Naturally I want to do this only when it's going to work (i.e.
> when the system has a toolset that understands the cleverness) and only
> when ICU support has been enabled to begin with. Also, as the goal is to
> get something into the Debian packages of boost 1.33.1, I need a solution
> that works there, not just on CVS HEAD.
>
> Any help would be appreciated. Note I'm reading the list via gmane, so
> cc:ing me directly will get my attention sooner.

I appreciate the problem (and have no idea how to accomplish that kind of
cleverness in a Jamfile). However, the structure of the regex lib isn't
quite as simple as you suspect; there is basically:

1) Core routines (this is shared by everything else).
2) Narrow character specific code.
3) Wide character specific code.
4) Unicode specific code.
5) POSIX compatible C-interfaces.
6) The deprecated class RegEx.

So the problem is this: if ICU forces certain build requirements, those
filter up to section (1) and then down to all the other parts, since (1) has
to be binary compatible with all the others, including the ICU part :-(

Even if you were to separate these out, would vendors really ship single
threaded versions of the narrow and wide character parts? What about folks
who want to use those in a multithreaded environment? There are certainly
plenty of people doing so.

Oh wait, reading over your post again, I see that you are using the
supposedly non-thread-safe version of the Boost libs? But these drag in
the thread-safe runtime anyway? If I've got this right, then one trick
might be to not include ICU support in non-thread-safe builds, and simply
say "if you need ICU then you have to use the thread-safe version". Does
that make sense? If so then it may be a packaging problem, rather than a
Boost.Build one. Do you know if ICU always requires thread safety, or is
this also a packaging choice?
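
To make that suggestion concrete, here is a rough sketch of the packaging
rule as a small C++ model. It is only an illustration: the enum, the function
names and the part labels are all hypothetical and do not come from
Boost.Regex or Boost.Build. It assumes the six parts listed above, with the
ICU-backed Unicode part built only into the thread-safe variant.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical model of a packaging policy. None of these names exist in
// Boost.Regex or Boost.Build; they are made up for illustration only.
enum class Threading { Single, Multi };

// Returns the parts of the regex library that a given build variant would
// contain. Assumption: the ICU-backed Unicode part is built only into the
// thread-safe (Multi) variant, and only when ICU is available at all.
std::vector<std::string> regex_parts(Threading threading, bool icu_available) {
    std::vector<std::string> parts = {
        "core", "narrow", "wide", "posix-c-api", "RegEx-class"};
    if (icu_available && threading == Threading::Multi) {
        parts.push_back("unicode-icu");
    }
    return parts;
}

// Small helper: true if the named part is present in the list.
bool has_part(const std::vector<std::string>& parts, const std::string& name) {
    return std::find(parts.begin(), parts.end(), name) != parts.end();
}
```

Under this rule, `regex_parts(Threading::Single, true)` would never list the
ICU part, so a single-threaded variant would carry no ICU dependency and
hence no libpthread dependency through ICU. Whether that is acceptable
depends on the answer to the question above.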

Just trying to get my head around this, John.


Boost-Build list run by bdawes at acm.org, david.abrahams at rcn.com, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk