build2 | 0.17.0 Release Notes

These notes provide a more detailed discussion of major new features, including the motivation for implementing them and their usage examples. For the complete list of changes, refer to the Release Announcement or the NEWS files in the individual packages.

The main area of focus in this release was again more advanced functionality for more complex projects. There are also notable improvements in C++20 modules support, including for import std; in Clang and MSVC. On the documentation front, we now have a comprehensive, step-by-step guide for packaging third-party projects for build2.

There are also quite a few smaller features in this release, such as new buildfile value types and functions, including support for JSON. Additionally, a large amount of maintenance work has been done, mostly in the form of handling various corner cases and fixing bugs. For example, there are about a hundred commit messages in this release that start with Fix.

The following sections discuss these and other new features in detail.

A note on backwards compatibility: you cannot upgrade to this release from build2 0.16.0; it has to be installed from scratch.

Also note that build2 0.16.0 has a bug which will prevent it from working with package repositories containing any packages with the build2 version constraint greater than 0.16.0. As a result, we recommend that you upgrade to 0.17.0 as soon as possible. To ease the impact of this bug we will embargo publishing packages with such constraints to cppget.org for one month from the date of this release (so until 18 July 2024).

1 Infrastructure
1.1 New CI configurations
2 Build System
2.1 C++20 modules support improvements
2.2 JSON buildfile value types
2.3 Set and map buildfile value types
2.4 New buildfile functions
2.5 Embedded development support improvements
3 Package Dependency Manager
3.1 Advanced CI build functionality
4 Project Dependency Manager
4.1 Customization improvements in new command

1 Infrastructure

1.1 New CI configurations

The following new build configurations have been added to the CI service:

freebsd_13.3-clang_17
freebsd_14.1-clang_18

linux_debian_12-gcc_14             (x86_64 and aarch64)
linux_debian_12-clang_17[_libc++]  (x86_64 and aarch64)
linux_debian_12-clang_18[_libc++]  (x86_64 and aarch64)

macos_14-clang_15.0                (Xcode 15.3, Apple Clang 1500.3.9.4)
macos_14-gcc_14_homebrew

windows_10-msvc_17.8               (LTSC)
windows_10-msvc_17.10
windows_10-clang_17_msvc_msvc_17.10
windows_10-clang_18_llvm_msvc_17.10
windows_10-gcc_13.2_mingw_w64

Note also that we have removed the Emscripten configuration (linux_debian_11-emcc_3.1.6) from the all and experimental classes. So if you want your packages to continue building in this configuration, you will need to add it explicitly, for example, using the wasm class or, alternatively, with build-include.

Additionally, the following new configurations have been added to the bindist class (this class contains configurations that support the generation and upload of binary distribution packages):

linux_ubuntu_22.04-gcc_11-bindist
linux_ubuntu_24.04-gcc_13-bindist
linux_fedora_39-gcc_13-bindist
linux_fedora_40-gcc_14-bindist
linux_rhel_9.2-gcc_11-bindist
macos_13-clang_15.0-bindist
windows_10-msvc_17.8-bindist

All in all, there are now 100 build configurations that cover a wide range of versions of all the major compilers (GCC, Clang, and MSVC) on all the major platforms (Linux, Mac OS, Windows, FreeBSD as well as WebAssembly).

2 Build System

2.1 C++20 modules support improvements

The past year saw substantial improvements in C++20 modules support in Clang and MSVC to the point that modules are becoming usable in real-world projects if recent-enough versions of these compilers are used (17.10 for MSVC and 18 for Clang).

There was also some progress on this front in GCC, but C++20 modules in that compiler for now remain largely unusable in practice.

To match these improvements, the build2 build system has been updated with support for named modules in Clang and MSVC.

Header units remain supported only in GCC, which is currently the only compiler that provides the necessary build system integration mechanisms (in the form of the dynamic module mapper). Note also that, in our assessment, the C++ community is starting to give up on header units, especially since the standard library modules are becoming usable (see below).

Besides better support for modules as the language mechanism, both Clang's libc++ and MSVC STL now provide the standard library modules std and std.compat and in build2 we now support their importation, including automatic compilation of the necessary BMIs.

Practically, all this means that you can now write:

import std;

int main ()
{
  std::cout << "Hello, world!" << std::endl;
}

Then enable modules support in your project:

cxx.std = latest
cxx.features.modules = true

using cxx

And everything will work automatically with both Clang with libc++ and MSVC.

If you plan to start using the standard library modules in your project, be aware that mixing importation of std with inclusion of standard library headers in the same translation unit only works in one direction: inclusion first, importation second. For MSVC you will also need to use version 17.10 or later.

To match this new support we have also updated our collection of C++20 module examples with versions that use import std; instead of header inclusion.

Finally, the C++ Modules Support section in the manual has been updated to match the current state of the implementation. It provides a brief introduction to C++20 modules, describes how they are built in build2, and provides guidelines on module design and modularizing existing code.

2.2 JSON buildfile value types

While the set of value types we already have is sufficient for most use-cases, sometimes you need to represent more complex, structured data. To cover such requirements we have added support for JSON as a buildfile value type.

Specifically, there is now the json type which can represent any valid JSON value: null, boolean, number, string, array, or object. Plus there are two specialized types, json_array and json_object, which represent top-level arrays and objects, respectively.

The purpose of these two specialized types is to allow you to "tighten" a variable type in cases where it is expected to contain a top-level array or object.

Accompanying these new types is a number of new functions that operate on them (see JSON Functions for their documentation):

$json.value_type(<json>)
$json.value_size(<json>)

$json.member_name(<json-member>)
$json.member_value(<json-member>)

$json.object_names(<json-object>)

$json.array_size(<json-array>)
$json.array_find(<json-array>, <json>)
$json.array_find_index(<json-array>, <json>)

$json.load(<path>)
$json.parse(<text>)
$json.serialize(<json>[, <indentation>])

There are several ways to construct a value of the json type. We can load it from a file with the $json.load() function or parse it from a JSON input text with the $json.parse() function. For example:

j = $json.load($src_base/board.json)
j = $json.parse('{"one":1, "two":2}')

Or we can construct it in a buildfile:

j = [json] one@1 two@([json] 2 3 4) three@([json] x@1 y@-1)

Here we are using lists like 2 3 4 to represent JSON arrays and lists of @-pairs like x@1 y@-1 to represent objects. Note also that if we need to specify an array or object as an element or member, we need to explicitly cast it to the json type, like ([json] 2 3 4).

This can also be done incrementally with append/prepend:

j = [json_object]
j += one@1
j += two@([json] 2 3 4)
j += three@([json] x@1 y@-1)

If you prefer, you can also specify valid JSON input text instead of using the above JSON-like syntax:

j = [json] '{"one":1, "two":[2, 3, 4], "three":{"x":1, "y":-1}}'

The use of @ as a pair separator is certainly unusual. Why couldn't we use something more canonical, such as = or :? The reason is that values that use pairs (either JSON values or out-qualified target names) are not the most common type of values we work with in build2. The most common are target names (which are basically file names) and command line options (compiler options, etc). And command line options routinely use = and : (the latter is commonly found in cl.exe options). This means that if we were to use one of them as a pair separator, we would have to escape or quote it when used in other, more common contexts.

Besides the above set of functions, other handy ways to access components in a JSON value are iteration and subscript. For example, given the above JSON input, the following:

for m: $j
{
  v = [json] $member_value($m)
  print $member_name($m) $value_type($v) $v
}

Will print:

one number 1
two array [2,3,4]
three object {"x":1,"y":-1}

In the above fragment, we had to explicitly cast the result of $json.member_value() to json because for simple JSON values (boolean, number, and string), this function returns values of the corresponding simple types rather than json. This is normally what we want but not in this case, where we want to call $json.value_type(), which expects a value of the json type.

To access a specific element of an array or a value of an object member we can use subscript:

print ($j[three])

A subscript can be nested:

print ($j[two][1])
print ($j[three][x])

These three print directives will produce the following output given the above JSON input:

{"x":1,"y":-1}
3
1

While a JSON value can be printed directly like any other value, the representation will not be pretty-printed (you can see an example of this on the first line of the above output). As a result, for complex JSON values, printing a serialized representation may be a more readable option. This can be achieved with the $json.serialize() function. For example, for tracing more complex JSON we would normally do something like this:

info 'value of j is:' $serialize($j)

This will produce the following output given the above JSON input:

buildfile:16:1: info: value of j is: {
  "one": 1,
  "two": [
    2,
    3,
    4
  ],
  "three": {
    "x": 1,
    "y": -1
  }
}

2.3 Set and map buildfile value types

This release also adds a number of set and map value types. Specifically, there is string_set, which is mapped to std::set<std::string> and string_map, which is mapped to std::map<std::string, std::string>.

While we will likely add set and map types with other element types in the future, for now we have added json_set and json_map to cover such use-cases. The two are mapped to std::set<json> and std::map<json, json>, respectively.

Accompanying these new types is a number of new functions:

$size(<set>)
$size(<map>)
$keys(<map>)

The $size() function returns the number of elements in the set or map (the number of key-value pairs in case of a map). The $keys() function returns the list of map keys. Note that for json_map the $keys() function returns this list as a JSON array.

For sets the subscript returns true if the value is present and false otherwise (so it is mapped to std::set::contains()). For example:

set = [string_set] a b c

if ($set[b])
  ...

For maps the subscript can be used to look up a value by key (so it is mapped to std::map::find()). The result is [null] if there is no value associated with the specified key. For example:

map = [string_map] a@1 b@2 c@3

b = ($map[b])  # 2

if ($map[z] == [null])
  ...

For json_map, the map subscript can be followed by the JSON array/object subscripts. For example:

map = [json_map] 2@([json] a@1 b@2) 1@([json] 1 2)
set = [json_set] ([json] x@1 y@2) ([json] a@1 b@2)

print ($map[2][b])              # 2
print ($set[([json] y@2 x@1)])  # true

Note that for maps the append (+=) is overriding (like std::map::insert_or_assign()) while the prepend (=+) is not (like std::map::insert()). In a sense, whatever appears last (from left to right) is kept, which is consistent with what we expect to happen when specifying the same key repeatedly in a literal representation. For example:

map = [string_map] a@0 b@2 a@1  # a@1 b@2
map += b@0 c@3                  # a@1 b@0 c@3
map =+ b@1 d@4                  # a@1 b@0 c@3 d@4
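
The difference between the two operations can be sketched directly in terms of the std::map functions they are compared to. This is a minimal C++ model under the assumption that each @-pair is applied in order from left to right; the append() and prepend() helper names and the string_map and pairs aliases are hypothetical and only serve the illustration.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

using string_map = std::map<std::string, std::string>;
using pairs = std::vector<std::pair<std::string, std::string>>;

// Model of append (+=): a later value overrides an existing one.
inline void
append (string_map& m, const pairs& ps)
{
  for (const auto& [k, v]: ps)
    m.insert_or_assign (k, v);
}

// Model of prepend (=+): an existing value is kept, only new keys are added.
inline void
prepend (string_map& m, const pairs& ps)
{
  for (const auto& [k, v]: ps)
    m.insert ({k, v});
}
```

Applying append() with a@0 b@2 a@1 to an empty map, then append() with b@0 c@3, and finally prepend() with b@1 d@4 reproduces the three states shown in the comments above.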

For sets both append (+=) and prepend (=+) have the same semantics (std::set::insert()). For example:

set = [string_set] a b
set += c b              # a b c
set =+ d b              # a b c d

An example of the set iteration:

set = [string_set] a b c

for k: $set
   ...

An example of the map iteration:

map = [string_map] a@1 b@2 c@3

for p: $map
{
  k = $first($p)
  v = $second($p)
}

An index-based iteration of maps can be implemented (with a bit of overhead) using the $keys() function:

map = [string_map] a@1 b@2 c@3

keys = $keys($map)

for i: $integer_sequence(0, $size($keys))
{
  k = ($keys[$i])
  v = ($map[$k])
}

2.4 New buildfile functions

This release adds quite a few new functions in several families.

New string functions (see String Functions in the manual for details):

$string.contains()
$string.starts_with()
$string.ends_with()
$string.replace()

New path functions (see Path Functions in the manual for details):

$path.absolute()
$path.simple()
$path.sub_path()
$path.super_path()
$path.complete()
$path.try_normalize()
$path.try_actualize()

New filesystem functions (see Filesystem Functions in the manual for details):

$filesystem.file_exists()
$filesystem.directory_exists()

2.5 Embedded development support improvements

Supporting embedded development in a general-purpose build system is challenging for two main reasons. Firstly, the conceptual model of the build may be very different from the general-purpose "libraries and executables" model. Secondly, this model may vary drastically from one embedded target to another.

As a concrete example, consider CHERIoT, a capability-extended RISC-V for IoT platform. Its build model, besides libraries and the firmware image (a final product that is loaded on the device and that is roughly equivalent to an executable), also includes the notion of compartments, out of which the firmware image is assembled.

Because of this diversity, trying to support every or even the most popular embedded targets in the build system core directly is a futile proposition. Instead, in build2, we focus on providing elementary building blocks as well as extra flexibility in the general-purpose mechanisms to allow implementing embedded target support outside of the build system core.

In this release we have added a large number of such building blocks which allowed us to implement prototype build support for CHERIoT outside of build2 (use README-BUILD2.md as a starting point if you want to give it a try).

Let's look at some specifics to better understand how this works. Support for an embedded target typically comes in the form of a custom C/C++ toolchain plus an SDK and CHERIoT fits this model: it has a Clang-based toolchain and the CHERIoT RTOS SDK. The build2 support naturally fits into the SDK: a CHERIoT project imports the SDK and gets both the source code for the RTOS (plus some extra libraries) as well as the build2 build rules.

The meat of the CHERIoT build2 support is the build rules. The basic "build chain" of a firmware image would look like this:

  1. Compile C/C++ source files to object files.
  2. Link object files to compartments.
  3. Link compartments to a firmware image.

While we were able to reuse the general-purpose C/C++ compilation rule from the build2 core, linking of compartments and firmware images is just too different to try to reuse the general-purpose link rule and we had to implement our own.

The first version of these rules was written in Buildscript but eventually we had to re-implement them in C++ to be able to handle more advanced requirements such as implicit dependency synthesis as well as back-propagation of information from targets to prerequisites. In the future we may also move the rules to a build system module to improve user experience even further (no need to build rules for every project).

While this overview already sounds pretty complex, it only scratches the surface. Other than the rules, the prototype also has to deal with:

Our key goal with this prototype was to make sure the user does not need to jump through unnecessary hoops or resort to hacks in order to build their CHERIoT-based project. For example, it would be typical for embedded development to hardcode the path to the SDK or the board name in the project's buildfile. Instead, in build2 we made sure that the standard mechanisms, such as importation and configuration variables, can be used to handle these requirements, just as one would expect in a build2-based project targeting a general-purpose platform. While the lack of hoops and hacks is already a benefit, more importantly, by using the same mechanisms across different styles of development we make sure that knowledge and experience can be applied across projects.

3 Package Dependency Manager

3.1 Advanced CI build functionality

In this release we have added two notable new package manifest values relating to the CI builds.

The first is build-bot. It allows specifying custom build bots (in the form of public keys) that should be used instead of the default bots to build the package.

Custom bots can be used, for example, to accommodate packages with special requirements, such as proprietary dependencies, that cannot be fulfilled using the default bots. For example, we use custom bots in ODB to test with proprietary databases such as MS SQL and Oracle.

Another potential use would be for larger projects, such as Boost or Qt, to host their own bots in order to provide better quality of service (that is, faster builds) for their own development.

The second is build-auxiliary. It allows specifying auxiliary configurations that provide additional components which are required for building or testing a package and that are impossible or impractical to provide as part of the build configuration itself. For example, a package may need access to a suitably configured database, such as PostgreSQL, in order to run its tests.

As a pilot, we provide the following auxiliary configurations on the default build bots:

linux_debian_12-mysql_8
linux_debian_12-postgresql_15
linux_debian_12-postgresql_16

As you may have noticed, this feature combines well with custom bots: if your project requires some special setup for testing, you can provide custom bots potentially with your own auxiliary machines.

To give you a concrete example of how everything fits together, in ODB we use custom bots to provide both customized build configurations and our own auxiliary configurations for testing proprietary MS SQL and Oracle databases. Specifically, we need customized build configurations because we need to install proprietary database runtimes (the ODBC driver for MS SQL and the OCI library for Oracle) on the build machines. We also need our own auxiliary configurations that run these proprietary databases for performing tests.

In practical terms, ODB is a good example of what it takes to comprehensively and automatically test a complex, cross-platform C++ project with non-trivial dependencies. With all the combinations of platforms, compilers, databases (ODB supports five: SQLite, PostgreSQL, MySQL, MS SQL, and Oracle), and various modes, we ended up with almost a thousand distinct builds. For some of them (the multi-database test that involves all five databases), the CI infrastructure has to run and coordinate five VMs in parallel (the build configuration plus four auxiliaries, one for each database except SQLite).

4 Project Dependency Manager

4.1 Customization improvements in new command

Work on the new packaging guide resulted in a number of improvements to the bdep-new command. These include:

In particular, the third-party sub-option is meant for converting an existing third-party project to build2. It automatically enables a number of other sub-options (such as no-version, no-readme, and no-symexport). It also adds a number of values to manifest that make sense to specify in a package of a third-party project and generates the PACKAGE-README.md template.