Modules Build System Scenarios

Introduction

This paper presents several scenarios that should be familiar to anyone with experience creating, maintaining, or using build systems within the C++ ecosystem. The discussion focuses on problems and possible solutions for supporting C++20 modules within each scenario. Topics discussed include:

  1. How to resolve a module import to either:
    1. A pre-built binary module interface (BMI) that is compatible with the importing tool chain and build configuration.
    2. One or more module interface unit source files that define the required module and from which a BMI can be constructed.
  2. How to build a BMI for a module interface unit source file when a compatible pre-existing BMI is not available.

Tangential topics include:

  1. How to determine if a pre-existing BMI is or is not compatible with the importing tool chain.
  2. How to determine if a pre-existing BMI is or is not compatible with the build configuration (including compatibility with code to be compiled and pre-compiled code linked via an existing object file, static library, or dynamic library).
  3. How to determine a suitable order in which to build BMIs for module interface unit source files.

This paper focuses on scenarios involving module interface units due to the imposed requirement that they be built before source files that import the modules they define can be successfully translated. Module implementation units are discussed only with regard to pre-compiled code and BMI compatibility.

Within this document, "compile" refers to the act of producing object code for a given source file and "build" is used to refer to both 1) construction of a BMI for a given module interface unit, and 2) the suite of compiler or tool invocations initiated to construct all derived artifacts for a given project. "Translation" refers to the act of parsing and semantically analyzing a source file.

For the purposes of this document, please note that a BMI need not be a file persisted on a file system. Rather, a BMI is a semantic representation of a module interface that may be constructed eagerly or lazily, in memory or on disk, locally or remotely. However, it is often convenient to think of a BMI as a file that holds the cached result of a previous build of a module interface unit.

The presented scenarios are intended to apply to all build systems. This includes traditional build systems that invoke compilers to produce executable files as well as the processes that are used by IDEs, static analyzers, or code review tools to find and translate C++ source code for presentation or analytical purposes. C++20 modules will require all tools that purport to accurately parse and semantically analyze C++ code to be able to dynamically navigate a translation unit dependency graph since C++ code that imports a module can no longer be processed as a single translation unit.

Terminology

Scenario 1: In-project dependent modules

In this scenario, a build system is tasked with translating a collection of source files, some of which import a module M that is defined by module interface unit source files that are known to the build system.

Problems

  1. The build system may lack knowledge of which source file(s) define module M.
  2. The build system may lack knowledge of which modules a source file imports.
  3. The build system may lack knowledge of module dependencies and implied module build order requirements.

Solution

  1. Scan all source files within the project to 1) identify the source files that define modules, 2) identify the modules that each source file imports, and 3) construct a directed acyclic graph (DAG) of module dependencies. Use the DAG to execute a build plan that ensures that, for each module M, a BMI is built for the module interface unit source file(s) that define M before any source file that imports M is translated. See the clang-scan-deps presentation by Alex Lorenz and Michael Spencer from the April 2019 European LLVM Developers' Meeting.
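
The scan-then-order approach can be sketched in a few lines. This is a minimal illustration only, assuming Python 3.9 or later for the standard `graphlib` module. The regular expressions and the names `scan` and `bmi_build_order` are hypothetical and are not taken from any existing tool. A production scanner such as clang-scan-deps works on preprocessed tokens, whereas this sketch matches raw text and ignores module partitions, header units, and modules whose interface is split across several source files.

```python
import re
from graphlib import TopologicalSorter

# Naive patterns (assumption): real scanners must preprocess and tokenize.
# Partitions ("export module M:part;") and header units are deliberately not matched.
EXPORT_RE = re.compile(r"^\s*export\s+module\s+([\w.]+)\s*;", re.MULTILINE)
IMPORT_RE = re.compile(r"^\s*(?:export\s+)?import\s+([\w.]+)\s*;", re.MULTILINE)


def scan(sources):
    """Map each file name to (module it defines or None, set of modules it imports)."""
    info = {}
    for path, text in sources.items():
        defined = EXPORT_RE.search(text)
        info[path] = (defined.group(1) if defined else None,
                      set(IMPORT_RE.findall(text)))
    return info


def bmi_build_order(sources):
    """Return in-project module names so that each module precedes its importers."""
    info = scan(sources)
    in_project = {module for module, _ in info.values() if module is not None}
    # TopologicalSorter expects {node: predecessors}; a module's predecessors
    # are the in-project modules it imports.
    graph = {module: imports & in_project
             for module, imports in info.values() if module is not None}
    # static_order() raises graphlib.CycleError if the imports form a cycle.
    return list(TopologicalSorter(graph).static_order())


sources = {
    "util.cppm": "export module app.util;\nexport int twice(int x) { return 2 * x; }\n",
    "core.cppm": "export module app.core;\nimport app.util;\n",
    "main.cpp": "import app.core;\nint main() { return 0; }\n",
}
print(bmi_build_order(sources))  # ['app.util', 'app.core']
```

Source files that define no module, such as `main.cpp` here, do not appear in the order; they can be translated at any point after the BMIs for the modules they import are available.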

Scenario 2: External dependent source-only modules

In this scenario, a build system is tasked with translating a collection of source files, some of which import a module M that is not defined by a module interface unit source file that is known to the build system. In this case, the build system must follow a protocol that facilitates discovery of the source files that define M and of the requirements for building a BMI from them.

Problems

  1. The build system lacks a mechanism to:
    1. Discover the external source file(s) that define module M.
    2. Discover the requirements for building a BMI for module M (language dialect, include paths, macros, etc.).

Solution

  1. Allow for a sequence of library identifiers to be provided to the build system such that each library identifier has an associated path (either well-known to the build system or user-provided) where the source files and other artifacts of the library are located.
  2. Specify a name and format for a library description file, to be supplied by cooperating libraries, that provides the following (note that these settings may need to vary by the consuming tool chain, target, or build configuration):
    1. Paths relative to the library description file that contain public header files.
    2. Paths relative to the library description file that contain private header files (to be used when building module interface units).
    3. Paths relative to the library description file that contain module interface unit source files (to be used when searching for source files that define modules).
    4. Language dialect requirements for building module interface units.
    5. Macro requirements for building module interface units.
    6. Libraries directly required by the library.
  3. Search for a library description file in the path associated with each library identifier. For each library description file found, scan all source files within the relative module interface unit source path(s) as is done in scenario 1.
  4. Once scanning is complete, if any imported modules remain unresolved, issue an error.
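
The discovery protocol above might look roughly like the following sketch. Everything concrete in it is an assumption made for illustration: the file name `library.json`, the JSON format, the field names (`module_interface_dirs`, `dialect`, `macros`, `private_include_dirs`), the `.cppm` extension, and the rule that the first library in the sequence wins when two libraries define the same module. The paper itself leaves the name and format of the description file unspecified.

```python
import json
import re
from pathlib import Path

DESCRIPTION_FILE = "library.json"  # hypothetical name and format
EXPORT_RE = re.compile(r"^\s*export\s+module\s+([\w.]+)\s*;", re.MULTILINE)


def defined_module(path):
    """Naive scan (assumption): name of the module a file defines, or None."""
    match = EXPORT_RE.search(path.read_text(encoding="utf-8"))
    return match.group(1) if match else None


def load_descriptions(library_paths):
    """Yield (library directory, parsed description) for each path that has one."""
    for lib_dir in map(Path, library_paths):
        desc_path = lib_dir / DESCRIPTION_FILE
        if desc_path.is_file():
            yield lib_dir, json.loads(desc_path.read_text(encoding="utf-8"))


def index_external_modules(library_paths):
    """Map module name -> (interface unit path, requirements for building its BMI)."""
    index = {}
    for lib_dir, desc in load_descriptions(library_paths):
        requirements = {
            "dialect": desc.get("dialect"),
            "macros": desc.get("macros", []),
            "private_include_dirs": desc.get("private_include_dirs", []),
        }
        for rel_dir in desc.get("module_interface_dirs", []):
            for src in sorted((lib_dir / rel_dir).glob("*.cppm")):
                name = defined_module(src)
                if name is not None:
                    # Assumption: earlier libraries in the sequence take priority.
                    index.setdefault(name, (src, requirements))
    return index


def resolve(imports, index):
    """Return index entries for all imports, or raise if any remain unresolved."""
    unresolved = sorted(set(imports) - set(index))
    if unresolved:
        raise LookupError("unresolved module imports: " + ", ".join(unresolved))
    return {name: index[name] for name in imports}
```

A build system would call `index_external_modules` with the paths associated with the user-supplied library identifiers and then pass the imports left unresolved by the in-project scan to `resolve`.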

Scenario 3: External dependent packaged modules

This scenario is similar to scenario 2, but differs in that the external package might include compatible pre-built BMIs that can be used to 1) obviate the need to build BMIs for module interface unit source files included in the package, and 2) ensure that BMIs are used that are compatible with any object files or static/dynamic libraries linked from the package.

Problems

  1. The package might provide BMIs for numerous tool chains, targets, and build configurations and the build system will have to select the right one.
  2. The package might not provide BMIs that are compatible with the consuming tool chain, target, or build configuration.
  3. The build system and tool chain must collaborate to select BMIs that are compatible with both the tool chain and build configuration (e.g., any object files or static/dynamic libraries to be linked in).
  4. There is no obvious best solution to determine if a BMI is compatible at a build configuration level. Minor declaration differences between two BMIs for the same module indicate the presence of ODR differences, but do not necessarily indicate an ABI difference that would be problematic in practice. There is a practical benefit to not being overly strict.
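
To make the selection problem concrete, the following sketch picks a packaged BMI by exact match on tool chain, target, and build configuration, and reports when none fits so that the build system can fall back to building from source as in scenario 2. The `BmiEntry` type, its fields, and `select_bmi` are hypothetical names introduced for illustration. Exact equality on the configuration is the strictest possible policy; as problem 4 notes, a practical implementation would likely compare only the settings that affect the interface or ABI.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BmiEntry:
    """One pre-built BMI shipped in a package (hypothetical record layout)."""
    module: str
    path: str
    toolchain: str     # e.g. compiler name and version
    target: str        # e.g. target triple
    config: frozenset  # build settings the BMI was produced with


def select_bmi(module, entries, toolchain, target, config):
    """Return the path of a compatible packaged BMI, or None if there is none."""
    for entry in entries:
        if (entry.module == module
                and entry.toolchain == toolchain
                and entry.target == target
                and entry.config == config):  # strict policy (assumption)
            return entry.path
    return None  # caller must build the BMI from the module interface unit


entries = [
    BmiEntry("math", "bmi/gcc-13/math.gcm", "gcc-13",
             "x86_64-linux-gnu", frozenset({"-std=c++20"})),
    BmiEntry("math", "bmi/clang-17/math.pcm", "clang-17",
             "x86_64-linux-gnu", frozenset({"-std=c++20"})),
]
print(select_bmi("math", entries, "clang-17",
                 "x86_64-linux-gnu", frozenset({"-std=c++20"})))
# bmi/clang-17/math.pcm
```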

Solution

  1. Extend the library description file from scenario 2 to include additional tool chain, target, and build configuration dependent information including:
    1. Paths relative to the library description file that contain BMIs, object files and static/dynamic libraries.
    2. Maps from module name to the name of a BMI.
    3. Linker options.
  2. Issue warnings/errors if the same module name is mapped to more than one distinct BMI (e.g., to BMIs that do not reflect a consistent interface).
  3. Embed either a BMI or BMI signature for each imported module M when building an object file. At link time, compare BMIs or BMI signatures across link units. Issue warnings/errors if mismatches are found among BMIs that are compatible with the tool chain (ignore those that are not).
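
The link-time check in item 3 can be illustrated as follows. This is a sketch under stated assumptions: the signature is taken to be a SHA-256 digest of the BMI contents, each link unit is assumed to record a (tool chain, signature) pair per imported module, and the names `bmi_signature` and `find_mismatches` are hypothetical. A digest of the raw bytes is stricter than necessary, since it also flags differences that do not affect the ABI; a real implementation might hash only a normalized view of the interface.

```python
import hashlib
from collections import defaultdict


def bmi_signature(bmi_bytes):
    """Signature of a BMI; here simply a digest of its bytes (assumption)."""
    return hashlib.sha256(bmi_bytes).hexdigest()


def find_mismatches(link_units, toolchain):
    """Report modules that were imported with differing signatures.

    link_units maps a link unit name to {module: (toolchain, signature)},
    as embedded when the unit was compiled. Entries recorded for a different
    tool chain are ignored. Returns {module: {signature: [link units]}}.
    """
    seen = defaultdict(lambda: defaultdict(list))
    for unit, embedded in link_units.items():
        for module, (unit_toolchain, signature) in embedded.items():
            if unit_toolchain == toolchain:
                seen[module][signature].append(unit)
    return {module: dict(by_signature)
            for module, by_signature in seen.items()
            if len(by_signature) > 1}


v1 = bmi_signature(b"interface v1")
v2 = bmi_signature(b"interface v2")
link_units = {
    "main.o": {"math": ("clang-17", v1)},
    "libmath.a": {"math": ("clang-17", v2)},
    "other.o": {"math": ("gcc-13", v2)},  # ignored: different tool chain
}
for module, by_signature in find_mismatches(link_units, "clang-17").items():
    print("warning: inconsistent BMIs for module", module, by_signature)
```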