

From: Stephan T. Lavavej (stl_at_[hidden])
Date: 2024-04-08 23:02:48


[Ruben Perez]
> Apologies if this is a dumb question, but what does "strong ownership" mean?

It's a good question which I'm only partially qualified to answer (I know a lot about what it's taken to modularize the entire Standard Library, but there's a lot of Core Language stuff that I haven't needed to learn for this narrow task).

My understanding is that the Standard doesn't specify (i.e. leaves it up to implementations to decide) what happens when two modules internally use the same names for different things. Consider a Cats module and a Dogs module. import Cats; makes Cats::meow() available, and import Dogs; makes Dogs::woof() available, because they've marked Cats::meow() and Dogs::woof() with the export keyword. These modules can be imported by the same TU (or separate TUs in the same program) and used without any conflicts, because they aren't trying to export the same names.

But what if the Cats module relies on non-exported, completely internal machinery details::make_noise(), and the (independently written and maintained) Dogs module also happens to have internal machinery named details::make_noise(), that does totally different canine things?

In the classic, non-modules world, the answer is clear - this is an ODR violation with undefined behavior, and you'll get a linker error if you're very lucky. (Header-only code, or statically linked separately compiled code, provides no isolation. DLLs do, but the Standard doesn't recognize their existence.)

Modules provide more structure because when the Cats module is built, it knows that it's for Cats machinery. So an implementation is allowed to make the Cats module "strongly own" non-exported symbols like details::make_noise(). (My understanding is that this results in details::make_noise() being mangled to reflect the fact that Cats owns it.)

If an implementation chooses the strong ownership strategy, then Cats can have its details::make_noise() coexist with Dogs also having its details::make_noise(), and there is no conflict, no ODR violation, and everything is fine. This intuitively makes sense, because each module strictly controls its exported surface area, and its implementation details shouldn't matter to other modules, and users should be able to freely combine modules.

For reasons that I completely do not understand with my cat-sized brain that is barely able to print "3.14", I believe that only MSVC has chosen the strong ownership strategy, while Clang and GCC have chosen "weak ownership" (which is perhaps easier to understand - with that strategy, details::make_noise() isn't specially affected by whether it appears in module Cats or module Dogs, so you get the same kind of ODR violation that classic headers would produce). Apparently compiler devs feel really strongly about both sides of this issue and I don't know why.

Anyways, this is relevant to making modules code interact with classically compiled code, because the classically compiled code doesn't know anything about modules and isn't attached to any named modules.

There's the "global module fragment" (again, something I'm only partially qualified to talk about), which is a structured way to say "hey, no module owns any of this stuff". My understanding is that it can be a good way to deal with entire libraries that are classic and haven't been modularized. However, I found that it wasn't really suited for dealing with a pre-existing mostly-header-only library that occasionally declares separately compiled machinery in the middle of its usual header-only definitions. I ended up using the GMF for UCRT machinery only (since I can enumerate all UCRT headers, include them in the GMF, and I don't want the std module to own anything from the UCRT; this is essentially belt-and-suspenders since everything in the UCRT is already extern "C" but I was advised that it was a good idea to put them in the GMF to be extra sure).

Hope this helps. Learning this stuff was difficult for me since (1) modules are so new and (2) a lot of what has been written about modules has been from a completely clean slate perspective, not from the perspective I needed, which was (3) continuing to support classic includes and named modules with the same codebase, and (4) having a classic separately compiled component (that accreted over 20+ years and was a big headache even before modules). I'm very eager to see more library authors explore modularization so the community can learn these techniques (and possibly discover superior ones; I don't pretend that I've found the best strategy for all time).

STL


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk