Complete C++20 Modules Support with GCC

Posted on 16 Feb 2021 by Boris Kolpackov with comments on r/cpp/

TL;DR: build2 now provides conforming and scalable support for all the major C++20 Modules features when used with GCC. This includes named modules, module partitions (both interface and implementation), header unit importation, and include translation. All of these features are also supported in libraries, including consumption of installed libraries with information about modules and importable headers conveyed in pkg-config files. As part of this effort we have also created a collection of examples that demonstrate C++20 Modules features that impact the build process.

Before going into detail on each of these features, let's clarify what we mean by conforming and scalable. With GCC (and soon with Clang), build2 uses the module mapper mechanism to discover module and header unit dependencies and communicate their mapping to the compiler. As a result, build2 does not place any restrictions on the structure of translation units or module features used. In other words, it is able to handle any conforming C++ module or header unit.

Similarly, build2 does not burden the user with any modules-related chores that would not scale to general, real-world projects and use cases. Specifically, it does not expect the user to manually specify and maintain the module name to file name mapping or the dependency information, or to arrange for pre-building of any modules or header units, including those for system headers. In particular, build2 will automatically build header units if and when they are imported or include-translated.

While build2 support for modules is complete and reasonably reliable, at the time of writing the same cannot be said about GCC. As a result, until the situation improves, the implementation is meant more for early experimentation than for production use. See How to Try for details.

One build2 feature that is not yet handled by the module mapper is support for auto-generated headers. However, this support has already been successfully prototyped during the early modules development in GCC and we hope to revive and complete it in the near future (see P1842 for details).

1 Named Modules
2 Module Partitions
3 Header Units
4 Include Translation
5 Modules and Libraries
6 How to Try

1 Named Modules

See cxx20-modules-examples/hello-module for the complete example.

If you are new to modules, see Practical C++ Modules for an introduction.

Building named modules with build2 is fairly straightforward. Here is a minimal example consisting of three files, hello.mxx, main.cxx, and buildfile:

// hello.mxx

export module hello;

import <string_view>;
import <iostream>;

export namespace hello
{
  void
  say_hello (const std::string_view& name)
  {
    std::cout << "Hello, " << name << '!' << std::endl;
  }
}

// main.cxx

import hello;

int
main ()
{
  hello::say_hello ("World");
}

# buildfile

cxx.std = experimental
cxx.features.modules = true

using cxx

assert $cxx.features.modules "no modules support for $cxx.signature"

mxx{*}: extension = mxx
cxx{*}: extension = cxx

exe{hello}: {cxx mxx}{*}

If you place all three files into some directory, you should be able to build it like this (see How to Try for details):

$ b config.cxx=g++-11
c++ .../include/c++/11.0.0/h{string_view}
c++ .../include/c++/11.0.0/h{iostream}
c++ mxx{hello}
c++ cxx{main}
ld exe{hello}

$ ./hello
Hello, World!

As you might have noticed, we used a different extension (.mxx) for our module interface (hello.mxx) compared to other source files (for example, main.cxx). We recommend that you follow this practice (using the .mpp extension if your source files use .cpp). If, however, you would like to use the same extension for all translation units, you can; you will just have to sort them out in your buildfiles. For example (after renaming hello.mxx to hello.cxx):

mxx{*}: extension = cxx
cxx{*}: extension = cxx

exe{hello}: mxx{hello} cxx{main}

If you compare the above three files to the hello-module example, you will notice that the latter has quite a few more files and some internal structure. The reason is that hello-module is a package with a structure more suitable for real-world projects. In contrast, the above three files are a simple build system-only project more suitable for quick tests and prototypes.

Because modules support in GCC is currently experimental and incomplete, we have to explicitly force this feature either with the following in our buildfile:

cxx.std = experimental
cxx.features.modules = true

Or by overriding on the command line:

$ b config.cxx.std=experimental config.cxx.features.modules=true ...

2 Module Partitions

See cxx20-modules-examples/hello-partition for the complete example.

There is not much difference in dealing with module partitions compared to primary module interfaces: both interface and implementation partitions should use the same extension as module interfaces (or otherwise be explicitly listed as mxx{} targets, as shown above). See the hello-partition example which shows the use of both interface and implementation partitions.
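
To make the distinction between the two kinds of partitions more concrete, here is a minimal multi-file sketch (it is not copied from the hello-partition example). The file names check.mxx and format.mxx, the :format partition, and the check_name() and format_greeting() functions are all hypothetical and chosen for illustration only, and the listing has not been tested against a specific GCC build. All three files use the .mxx extension, so the buildfile from the Named Modules section is assumed to pick them up unchanged.

```cpp
// check.mxx: interface partition (its exports become part of hello)

export module hello:check;

import <string_view>;

export namespace hello
{
  // Hypothetical helper: accept any non-empty name.
  //
  bool
  check_name (std::string_view name)
  {
    return !name.empty ();
  }
}

// format.mxx: implementation partition (visible only inside hello)

module hello:format;

import <string>;
import <string_view>;

namespace hello
{
  // Hypothetical helper: not exported, so importers of hello cannot call it.
  //
  std::string
  format_greeting (std::string_view name)
  {
    return "Hello, " + std::string (name) + '!';
  }
}

// hello.mxx: primary module interface

export module hello;

export import :check; // Re-export the interface partition.
import :format;       // Implementation partitions cannot be re-exported.

import <iostream>;
import <string_view>;

export namespace hello
{
  void
  say_hello (std::string_view name)
  {
    if (check_name (name))
      std::cout << format_greeting (name) << std::endl;
  }
}
```

In this sketch a consumer that does import hello; can call both hello::say_hello() and hello::check_name() but not hello::format_greeting().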

3 Header Units

See cxx20-modules-examples/hello-header-import for the complete example.

Importing header units is straightforward: simply replace #include directives with import declarations and hope the compiler can handle it. In fact, we've already used header units in our named module example above. Here is another example:

// main.cxx

import <iostream>;

int
main ()
{
  std::cout << "Hello, World!" << std::endl;
}

# buildfile

cxx.std = experimental
cxx.features.modules = true

using cxx

assert $cxx.features.modules "no modules support for $cxx.signature"

cxx{*}: extension = cxx

exe{hello}: cxx{*}

$ b config.cxx=g++-11
c++ .../include/c++/11.0.0/h{iostream}
c++ cxx{main}
ld exe{hello}

Naturally, importation is not limited to standard library headers and we can also import our project's own headers as demonstrated in the hello-header-import example.

4 Include Translation

See cxx20-modules-examples/hello-header-translate for the complete example.

Instead of manually replacing #include directives with import declarations in our source code, we can translate them on the fly. To control which headers are translated we use the config.cxx.translate_include configuration variable. For example, we can try to translate all the importable headers:

$ b config.cxx=g++-11 config.cxx.translate_include=all-importable

Since include translation is non-intrusive, you can try it on any project that uses build2 by forcing modules support and enabling include translation, for example:

$ b config.cxx=g++-11 \
  config.cxx.std=experimental \
  config.cxx.features.modules=true \
  config.cxx.translate_include=all-importable

Which headers are importable? The C++20 standard specifies that all the C++ standard library headers (but not the C library wrappers) are importable. For example, this is how we can translate only importable standard library headers:

config.cxx.translate_include=std-importable
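
As an illustration of what this means for existing code, consider the following conventional, include-based source fragment (make_greeting() and print_greeting() are hypothetical names used only for this sketch). Assuming it is built with the std-importable setting above, the <string> and <string_view> includes would be candidates for translation while <cstdio>, being a C library wrapper, would remain a textual include. The source itself stays unchanged either way.

```cpp
#include <cstdio>      // C library wrapper: not importable, stays textual.
#include <string>      // C++ library header: candidate for translation.
#include <string_view> // C++ library header: candidate for translation.

// Hypothetical helper: build the greeting text.
//
std::string
make_greeting (std::string_view name)
{
  std::string r ("Hello, ");
  r += name;
  r += '!';
  return r;
}

// Hypothetical helper: print the greeting using the C library.
//
void
print_greeting (std::string_view name)
{
  std::puts (make_greeting (name).c_str ());
}
```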

We can also mark our own headers as importable with the cxx.importable variable, for example:

# buildfile

...

# Assume headers are importable unless stated otherwise.
#
hxx{*}: cxx.importable = true

There could also be third-party headers that are not explicitly marked as importable. To handle this, we can request the translation of specific headers or header wildcard patterns, for example:

config.cxx.translate_include='<iostream> "<boost/container/*.hpp>"'

We can also disable the translation of specific headers or header groups, for example:

config.cxx.translate_include='all-importable std-importable@false'
config.cxx.translate_include='std-importable <iostream>@false'

You can also attempt to translate all headers by specifying the all header group. This is primarily useful for experimentation, for example, to discover which headers would break the build if translated.

Finally, if we need to adjust or disable include translation for our project or for specific translation units in our project, we use the cxx.translate_include variable, for example:

# buildfile

cxx.translate_include += <iostream>@false

obj{main}: cxx.translate_include = # Don't translate anything.

5 Modules and Libraries

See cxx20-modules-examples/hello-library-module and cxx20-modules-examples/hello-library-header for the complete examples.

Building modules as part of a library does not pose any additional complications. Here is how we can convert our example from the Named Modules section to build a library:

# buildfile

...

exe{hello}: cxx{main} lib{hello}
lib{hello}: {cxx mxx}{* -main}

However, installing libraries with modules and then consuming the result in other projects (and potentially with other build systems) requires additional mechanisms. Specifically, the installation needs to convey which modules are provided by the library and the location of their module interface source files (the consuming build system also needs to arrange for the compilation of these interfaces on the side).

Instead of inventing yet another file format, we chose to reuse the pkg-config files that build2 automatically produces for every library it installs. Specifically, the pkg-config format supports free variables, which is what we use to convey the modules information. For example, this is the relevant fragment from the libhello.pc file in the hello-library-module example (.. in hello..check encodes the : module partition separator):

cxx.modules = \
  hello..check=/usr/include/libhello/check.mxx \
  hello=/usr/include/libhello/hello.mxx
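
To show how another build system might consume this information, here is a minimal sketch of a parser for the value of this variable. It is not how build2 itself does it: the module_info type and the parse_cxx_modules() function are hypothetical, and the sketch assumes the value has already been expanded by pkg-config into whitespace-separated name=path entries with no spaces in the paths.

```cpp
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// One entry of the cxx.modules variable (hypothetical type).
//
struct module_info
{
  std::string name; // For example, hello or hello:check.
  std::string path; // Location of the module interface source file.
};

// Parse the expanded cxx.modules value (hypothetical helper). Assumes
// whitespace-separated name=path entries and paths without spaces.
//
std::vector<module_info>
parse_cxx_modules (const std::string& value)
{
  std::vector<module_info> r;
  std::istringstream is (value);

  for (std::string e; is >> e; )
  {
    std::string::size_type p (e.find ('='));

    if (p == std::string::npos || p == 0)
      continue; // Skip malformed entries in this sketch.

    std::string n (e, 0, p);

    // Decode .. back into the : partition separator. A module name
    // contains at most one such separator.
    //
    std::string::size_type d (n.find (".."));

    if (d != std::string::npos)
      n.replace (d, 2, ":");

    r.push_back (module_info {std::move (n), std::string (e, p + 1)});
  }

  return r;
}
```

For the fragment above, this sketch would produce two entries: hello:check mapped to /usr/include/libhello/check.mxx and hello mapped to /usr/include/libhello/hello.mxx.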

Naturally, the same mechanism is used to convey which library headers are importable (from the hello-library-header example):

cxx.importable_headers = \
  /usr/include/libhello/check.hxx \
  /usr/include/libhello/hello.hxx

What happens to a header-only library in a modular world? If we replace all the headers with module interfaces, then we end up with a module interface-only library. Unlike with headers, however, the fact that such a library only contains module interfaces does not mean it's binless (that is, without a library binary) since in general compiling a module interface produces an object file that must be linked to the module consumer. As a result, if we want our module interface-only library to be binless, then we must explicitly mark it as such with the bin.binless variable. For example, we can make our lib{hello} from above (which only contains hello.mxx) binless:

# buildfile

...

lib{hello}: mxx{*}
{
  bin.binless = true
}

Note that unlike a header-only library, a binless module interface-only library can define non-inline, non-template functions and variables. So now we can have our cake and eat it too (but perhaps we shouldn't indulge too much).

6 How to Try

At the time of writing, to try these examples, you will need the latest staged version of the build2 toolchain as well as the latest GCC master build.

If you would like to see what's going on under the hood, you can increase the verbosity level to -v or -V (the latter will include the mapper interactions). In this case it may also make sense to run the build system serially to keep the output comprehensible, for example:

$ b -sV ...

If you run into any issues with build2, see Getting Help for various ways to get assistance or report bugs. You should also be prepared to encounter compiler bugs which we encourage you to report to the GCC bug database. Start the bug summary with [modules] to mark such bugs as related to C++20 modules. You can also see the current list of such bugs.