From: Daniela Engert (dani_at_[hidden])
Date: 2024-07-07 16:32:30


On 07.07.2024 at 12:25, John Maddock via Boost wrote:
> On 07/07/2024 07:56, Daniela Engert via Boost wrote:
>> Hi,
>>
>> what you're describing is the technique for which I coined the term
>> "dual mode support" in my presentations. I added it to {fmt} in 2021
>> to enable the use of the library as both #include and import; MS-STL
>> adopted it for their implementation of the standard library.
>>
>> Please take into account that I usually also add something along
>> xxx_ATTACH_TO_GLOBAL_MODULE (in this case xxx = BOOST) to avoid
>> potential ODR violations for projects that use a library either as
>> module or through headers. Otherwise users may end up with
>> conflicting, but apparently *identical* definitions that they cannot
>> decipher the reason for: one definition is attached to the named
>> module, the other to the global module. The compiler will be very
>> upset. Linkers usually work.
> Sorry, don't quite follow, can you explain?

OK, a simple example. Consider this definition in an internal header.
It is just a stand-in for, e.g., the definitions in a Boost library:

// internal.h
namespace lib {
BOOST_MODULE_EXPORT struct S { int s; };
}

Use it as a traditional include (the public interface):

// lib.h
#pragma once
#ifndef BOOST_MODULE_EXPORT
# define BOOST_MODULE_EXPORT
#endif
#include "internal.h"

Use it as a C++20 module (the public interface):

// module lib
export module lib;
#define BOOST_MODULE_EXPORT export
#include "internal.h"

3rd-party TUs:

// a.h
#include <lib.h>
class A { lib::S s; };

// b.h
import lib;
class B { lib::S s; };

An unsuspecting consumer brings both 3rd-party TUs into the same TU. The
ordering is irrelevant for the purpose of reasoning:

// c.cpp
#include <a.h>
#include <b.h>
struct C { A a; B b; };

Now you end up with *two* definitions of the same entity S:

 1. one attached to the global module (through lib.h)
 2. one attached to the named module lib (through module lib)

C++ programmers can't tell them apart by language means, but compilers
and linkers can. In the first case it is entity S, in the second it is
entity S@lib. From the compiler's perspective, they have the same name
but different attachment - oops. From the linker's perspective, they
have different symbols and/or other properties in the object files. The
linker would be fine with that, but the compiler will tell you in no
uncertain terms that this is a no-no. Users will scratch their heads
about why they are told that two entities are the same yet conflicting.

In general, export entities *either* through a header interface *or*
through a module interface, but never both. Compilers take advantage of
different attachments and lazily load module interfaces only when really
needed.

If you as an author want to support mixing both traditional #includes
and module imports, then you'd better make sure that entity S is
attached to the global module in both cases, along these lines:

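// module lib
export module lib;
#define BOOST_MODULE_EXPORT export
#ifdef BOOST_xxx_ATTACHED_TO_GLOBAL_MODULE
// Declarations inside a linkage specification are attached to the
// global module, even within the purview of module lib.
extern "C++" {
#endif
#include "internal.h"
#ifdef BOOST_xxx_ATTACHED_TO_GLOBAL_MODULE
}  // extern "C++"
#endif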

This way, name attachment is a module compile-time option.
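
For the header side, here is a minimal single-file sketch of what a plain #include consumer effectively sees after preprocessing. It is an illustration only, not taken from any Boost library. It assumes the extern "C++" wrapper is applied unconditionally, to show that it is harmless in a non-module TU, where C++ language linkage is the default anyway. The constructor and value() member of class A are hypothetical additions so that the result can be checked at compile time.

```cpp
// Single-file stand-in for what a plain #include consumer sees
// (no module involved).

// --- lib.h: header mode, the export macro expands to nothing ---
#ifndef BOOST_MODULE_EXPORT
#  define BOOST_MODULE_EXPORT
#endif

// In a non-module TU this wrapper changes nothing, because C++ language
// linkage is already the default. (Assumption for this sketch: the
// wrapper is applied unconditionally.)
extern "C++" {

// --- internal.h ---
namespace lib {
BOOST_MODULE_EXPORT struct S { int s; };
}

}  // extern "C++"

// --- consumer code, modelled on a.h; the members are hypothetical ---
class A {
public:
    constexpr explicit A(int v) : s{v} {}
    constexpr int value() const { return s.s; }
private:
    lib::S s;
};
```

In header mode S is an ordinary type attached to the global module, with or without the wrapper. The wrapper only makes a difference inside the purview of a named module.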

I have been using this, for example, in my Asio module for a long time.

>>
>> Boost libraries that intend to support modules *must* also *add*
>> tests that check that the exported entities are
>>   * available on the consumer side
>>   * instantiation and name lookup works when called from the consumer
>> side
> Absolutely.  Complete lack of build/tooling support for modules is
> currently a showstopper here.  When experimenting with a Regex module
> I basically had to resort to the command line.

I'm surprised. I've given talks on modules since 2019, when they became
kind of usable in MSVC, have used them in production since the beginning
of 2022, gave a CppCon keynote about a non-trivial application using
only modules and the modularized standard library in 2022 (using
MSBuild), and gave talks on how to use them cross-platform with CMake in
2023.

If you happen to have gcc in mind, then you're in a tough spot, though.

>>
>> Module implementation units have two relevant interfaces, not one.
> Also don't follow, can you explain?

The implementer-facing interface is exercised while *compiling* the
module interface unit. This means that the module can be compiled, name
lookup of all non-dependent entities and 1st-phase name lookup of
dependent entities succeed, and overload resolution takes place as far
as name lookup allows.

The user-facing interface is exercised while *using* a module in a
different TU. The exported entities must be found by name lookup, and
ADL and 2nd-phase name lookup of module-internal entities must succeed
for instantiations of exported templates. This includes lookup in
non-exported namespaces buried within the module, performed from a
non-module TU!

Therefore, the 2nd (mostly forgotten) interface is the crucial one!
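
The lookup pattern that such a consumer-side test has to cover can be sketched in plain C++, without any module. All names here (lib::detail, tag, twice, apply, handle) are hypothetical: lib::detail stands in for a non-exported namespace, apply for an exported template. In a real module the same call must also succeed when the instantiation is triggered from the importing TU.

```cpp
// Plain C++ sketch (no modules) of a dependent call that is resolved
// only at instantiation time, via ADL into an internal namespace.
namespace lib {

namespace detail {  // stands in for a non-exported namespace
struct tag {};
// Not visible from lib by ordinary unqualified lookup; reachable only
// through ADL on detail::tag.
constexpr int twice(tag, int v) { return 2 * v; }
}  // namespace detail

// Stands in for an exported template.
template <typename T>
constexpr int apply(T t, int v) {
    // 1st phase: unqualified lookup finds no `twice` here, which is fine
    // because the call is dependent on T.
    // 2nd phase: at instantiation, ADL on T finds detail::twice.
    return twice(t, v);
}

using handle = detail::tag;  // public name for the internal type

}  // namespace lib

// Consumer side: this instantiation triggers the 2nd-phase lookup.
constexpr int consumer_result = lib::apply(lib::handle{}, 21);
```

A test that only compiles the template definition exercises the first interface. Only the instantiation on the last line exercises the second one.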

>> Almost all libraries that claim module capability support, and that
>> I've seen so far, fail to do that. In rare cases I have forks that
>> add this missing piece.
>> Lacking those gives users horrible experiences and modules a bad
>> reputation.
>>
>> On the standard library: *all* C++ implementers of the standard
>> library have agreed to support C++20 modules also for C++20 build
>> modes. So you're not tied to C++23.
> Good.  But if the module uses "import std;" and something somewhere
> else #includes <iostream> my experience is that everything breaks. 
> It's not supposed to, but it does.  I must test the latest MSVC
> release though.
>>
>> And lastly, there is BMI compatibility and the story of compiler
>> flags matching. But that's for a different day.
>
> Right, there is no possibility of a build-and-install step, each
> project must build the modules they are using from source using
> options that exactly match their project.  It's not hard, but it is a
> barrier to adoption.
>
> John.

Thanks,
Dani

-- 
PGP/GPG: 2CCB 3ECB 0954 5CD3 B0DB 6AA0 BA03 56A1 2C4638C5
