|
Boost : |
From: John Maddock (john_at_[hidden])
Date: 2003-11-04 08:07:00
Folks,
I've put together a set of draft guidelines come tutorial for libraries that
have separate source code - this is really just a distillation of my
experience with the regex library, updated to reflect Rene's forthcoming
install code. Most of what follows is unashamedly Windows centric, there
being much fewer issues effecting Unix shared libraries. It also doesn't
cover the build system / Jamfiles required, but I suspect that Rene is
working on that already. The complete text is shown below, as ever comments
are very welcome!
Thanks,
John.
****************************************************
Guidelines for Authors of Boost Libraries Containing Separate Source
These guidelines are designed for the authors of Boost libraries which have
separate source that need compiling in order to use the library.
Throughout, this guide refers to a fictitious "whatever" library, so replace
all occurrences of "whatever" or "WHATEVER" with your own library's name
when copying the examples.
Changes Affecting Source Code
Supporting Windows Dll's
On most Unix-like platforms no special annotations of source code are
required in order for that source to be compiled as a shared library.
However the majority of Windows compilers require that symbols that are to
be imported or exported from a dll, be prefixed with __declspec(dllimport)
or __declspec(dllexport). Without this mangling of source code, it is not
possible to correctly build shared libraries on Windows (historical note -
originally these declaration modifiers were required on 16-bit Windows where
the memory layout for exported classes was different from that of "local"
classes - although this is no longer an issue, there is still no way to
instruct the linker to "export everything", it also remains to be seen
whether 64-bit Windows will resurrect the segmented architecture that led to
this problem in the first place. Note also that the mangled names of
exported symbols are different from non-exported ones, so
__declspec(dllimport) is required in order to link to code within a dll).
In order to support the building of shared libraries on MS Windows your code
will have to prefix all the symbols that your library exports with a macro
(lets call it BOOST_WHATEVER_DECL) that your library will define to expand
to either __declspec(dllexport) or __declspec(dllimport) or nothing,
depending upon how your library is being built or used. Typical usage would
look like this:
#ifndef BOOST_WHATEVER_HPP
#define BOOST_WHATEVER_HPP
#include <boost/config.hpp>
#ifdef BOOST_HAS_DECLSPEC // defined in config system
// we need to import/export our code only if the user has specifically
// asked for it by defining BOOST_ALL_DYN_LINK if they want all boost
// libraries to be dynamically linked, or BOOST_WHATEVER_DYN_LINK
// if they want just this one to be dynamically liked:
#if defined(BOOST_ALL_DYN_LINK) || defined(BOOST_WHATEVER_DYN_LINK)
// export if this is our own source, otherwise import:
#ifdef BOOST_WHATEVER_SOURCE
# define BOOST_WHATEVER_DECL __declspec(dllexport)
#else
# define BOOST_WHATEVER_DECL __declspec(dllimport)
#endif // BOOST_WHATEVER_SOURCE
#endif // DYN_LINK
#endif // BOOST_HAS_DECLSPEC
//
// if BOOST_WHATEVER_DECL isn't defined yet define it now:
#ifndef BOOST_WHATEVER_DECL
#define BOOST_WHATEVER_DECL
#endif
//
// this header declares one class, and one function by way of examples:
//
class BOOST_WHATEVER_DECL whatever
{
// details.
};
BOOST_WHATEVER_DECL whatever get_whatever();
#endif
And then in the source code for this library one would use:
//
// define BOOST_WHATEVER SOURCE so that our library's
// setup code knows that we are building the library (possibly exporting
code),
// and not using it (possibly importing code):
//
#define BOOST_WHATEVER_SOURCE
#include <boost/whatever.hpp>
// class members don't need any further annotation:
whatever::whatever()
{
}
// but functions do:
BOOST_WHATEVER_DECL whatever get_whatever()
{
return whatever();
}
Importing/exporting dependencies
As well as exporting your main classes and functions (those that are
actually documented), Microsoft Visual C++ will warn loudly and often if you
try to import/export a class whose dependencies are not also exported.
Dependencies include: any base classes, any user defined types used as data
members, plus all of the dependencies of your dependencies and so on. This
causes particular problems when a dependency is a template class, because
although it is technically possible to export these, it is not at all easy,
especially if the template itself has dependencies which are
implementation-specific details. In most cases it's probably better to
simply suppress the warnings using:
#ifdef BOOST_MSVC
# pragma warning(push)
# pragma warning(disable : 4251 4231 4660)
#endif
// code here
#ifdef BOOST_MSVC
#pragma warning(pop)
#endif
This is safe provided that there are no dependencies that are (template)
classes with non-constant static data members, these really do need
exporting, otherwise there will be multiple copies of the static data
members in the program, and that's really really bad.
Historical note: on 16-bit Windows you really did have to export all
dependencies or the code wouldn't work, however since the latest Visual
Studio .NET supports the import/export of individual member functions, it's
a reasonably safe bet that Windows compilers won't do anything nasty - like
changing the class's ABI - when importing/exporting a class.
Rationale:
Why bother - doesn't the import/export mechanism take up more code that the
classes themselves? A good point, and probably true, however there are some
circumstances where library code must be placed in a shared library - for
example when the application consists of multiple dll's as well as the
executable, and more than one those dll's link to the same Boost library -
in this case if the library isn't dynamically linked and it contains any
global data (even if that data is private to the internals of the library)
then really bad things can happen - even without global data, we will still
get a code bloating effect. Incidentally, for larger applications,
splitting the application into multiple dll's can be highly advantageous -
by using Microsoft's "delay load" feature the application will load only
those parts it really needs at any one time, giving the impression of a much
more responsive and faster-loading application.
Why static linking by default? In the worked example above, the code
assumes that the library will be statically linked unless the user asks
otherwise. Most users seem to prefer this (there are no separate dll's to
distribute, and the overall distribution size is often significantly smaller
this way as well: i.e. you pay for what you use and no more), but this is a
subjective call, and some libraries may even only be available in dynamic
versions (Boost.threads for example). Suggestion: use
BOOST_WHATEVER_STATIC_LINK to force static linking of a library that is
dynamic by default, and use BOOST_WHATEVER_DYN_LINK to force dynamic linking
of a library that is static by default.
Fixing the ABI of the Compiled Library
There are some compilers (mostly Microsoft Windows compilers again!), which
feature a range of compiler switches that alter the ABI of C++ classes and
functions. By way of example, consider Borland's compiler which has the
following options:
-b (on or off - effect emum sizes).
-Vx (on or off - empty members).
-Ve (on or off - empty base classes).
-aX (alignment - 5 options).
-pX (Calling convention - 4 options).
-VmX (member pointer size and layout - 5 options).
-VC (on or off, changes name mangling).
-Vl (on or off, changes struct layout).
These options are provided in addition to those effecting which runtime
library is used (more on which later); the total number of combinations of
options can be obtained by multiplying together the individual options
above, so that gives 2*2*2*5*4*5*2*2 = 3200 combinations!
The problem is that users often expect to be able to build the Boost
libraries and then just link to them and have everything just plain work, no
matter what their project settings are. Irrespective of whether this is a
reasonable expectation or not, without some means of managing this issue,
the user may well find that their program will experience strange and hard
to track down crashes at runtime unless the library they link to was built
with the same options as their project (changes to the default alignment
setting are a prime culprit). One way to manage this is with "prefix and
suffix" headers: these headers invoke compiler specific #pragma directives
to instruct the compiler that whatever code follows was built (or is to be
built) with a specific set of compiler ABI settings.
Boost.config provides the macro BOOST_HAS_ABI_HEADERS which is set whenever
there are prefix and suffix headers available for the compiler in use,
typical usage is like this:
#ifndef BOOST_WHATEVER_HPP
#define BOOST_WHATEVER_HPP
#include <boost/config.hpp>
// this must occur after all of the includes and before any code appears:
#ifdef BOOST_HAS_ABI_HEADERS
# include BOOST_ABI_PREFIX
#endif
//
// this header declares one class, and one function by way of examples:
//
class whatever
{
// details.
};
whatever get_whatever();
// the suffix header occurs after all of our code:
#ifdef BOOST_HAS_ABI_HEADERS
# include BOOST_ABI_SUFFIX
#endif
#endif
Rationale:
Without some means of managing this issue, users often report bugs along the
line of "Your silly library always crashes when I try and call it" and so
on. These issues can be extremely difficult and time consuming to track
down, only to discover in the end that it's a compiler setting that's
changed the ABI of the class and/or function types of the program compared
to those in the pre-compiled library. The use of prefix/suffix headers can
minimize this problem, although probably not remove it completely.
Counter Argument #1:
Trust the user, if they want 13-byte alignment (!) let them have it.
Counter Argument #2:
Prefix/suffix headers have a tendency to "spread" to other boost libraries -
for example if boost::shared_ptr<> forms part of your class's ABI, then
including prefix/suffix headers in your code will be of no use unless
shared_ptr.hpp also uses them. Authors of header-only boost libraries may
not be so keen on this solution - with some justification - since they don't
face the same problem.
Enabling Automatic Library Selection and Linking
Many Windows compilers ship with multiple runtime libraries - for example
Microsoft Visual Studio .NET comes with 6 versions of the C and C++ runtime.
It is essential that the Boost library that the user links to is built
against the same C runtime as that that the program is built against. If
that is not the case, then the user will experience linker errors at best,
and runtime crashes at worst. The Boost build system manages this by
providing different build variants, each of which is build against a
different runtime, and gets a slightly different mangled name depending upon
which runtime it is built against. For example the regex libraries get
named as follows when built with Visual Studio .NET 2003:
boost_regex-vc71-mt-1_31.lib
boost_regex-vc71-mt-gd-1_31.lib
libboost_regex-vc71-mt-1_31.lib
libboost_regex-vc71-mt-gd-1_31.lib
libboost_regex-vc71-mt-s-1_31.lib
libboost_regex-vc71-mt-sgd-1_31.lib
libboost_regex-vc71-s-1_31.lib
libboost_regex-vc71-sgd-1_31.lib
The difficulty now is selecting which of these the user should link his or
her code to.
In contrast, most Unix compilers typically only have one runtime (or
sometimes two if there is a separate thread safe option). For these systems
the only choice in selecting the right library variant is whether they want
debugging info, and possibly thread safety.
Historically Microsoft Windows compilers have managed this issue by
providing a #pragma option that allows the header for a library to
automatically select the library to link to. This makes everything
automatic and extremely easy for the end user: as soon as they include a
header file that has separate source code, the name of the right library
build variant gets embedded in the object file, and as long as that library
is in the linker search path, it will get pulled in by the linker without
any user intervention.
This feature can be enabled for Boost libraries by including the header
<boost/config/auto_link.hpp> after defining BOOST_LIB_NAME and
BOOST_DYN_LINK as required:
//
// Automatically link to the correct build variant where possible.
//
// Set the name of our library, this will get undef'ed by auto_link.hpp
// once it's done with it:
//
#define BOOST_LIB_NAME boost_whatever
//
// If we're importing code from a dll, then tell auto_link.hpp about it:
//
#if defined(BOOST_ALL_DYN_LINK) || defined(BOOST_WHATEVER_DYN_LINK)
# define BOOST_DYN_LINK
#endif
//
// And include the header that does the work:
//
#include <boost/config/auto_link.hpp>
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk