

C++ compilation model (pre-processing & transitive includes)

Definition

C++ compilation happens per translation unit (TU) — normally one .cpp file + everything it #includes, transitively. In the pre-processing step the compiler literally replaces #include "foo.h" with the bytes of foo.h, which may itself contain #include "bar.h", and so on. The result is a single mega-file that is then parsed, template-instantiated, optimised, and compiled.

This has a load-bearing consequence for build performance:

C++ build times are roughly proportional to the number of bytes sent to the compiler after pre-processing. (Source: sources/2024-04-27-figma-speeding-up-c-build-times)

If file C #includes B which #includes A, then compiling C processes all of A's bytes, even if C never directly names any symbol from A.

Why it's a scaling problem

Two independent failure modes both inflate post-pre-processing bytes:

  1. Unnecessary includes. A header is listed in #include but the file uses nothing from it directly — included only because something it transitively pulled was used (or once was, and isn't now). Removing the direct include shrinks the TU's transitive closure.
  2. Used-but-huge includes. A header is directly used, so it's a correct include, but including it drags in a megabyte-class transitive tree. Fix = concepts/forward-declaration or split the header so the using file can include only the part it needs.

These two modes require different tools — the first is static-analysis over AST + symbol tables (systems/include-what-you-use, systems/diwydu); the second is measurement over the include DAG (systems/includes-py).

Diagnostic signal: bytes/LOC ratio drift

Figma's symptom was build-time growth decoupling from code growth: +10% LOC / +50% build time in one year. Concretely, the ratio of compiler-seen bytes to added LOC was growing unboundedly — each new header file's transitive closure kept getting bigger. This is a canary for include-graph dependency bloat even before build times become a top pain point.

Why cache + faster hardware isn't enough

Figma's pre-rewrite mitigations:

  • M1 Max laptops — better serial compile throughput, but a one-time win: build times climbed back to their previous levels within months as bytes kept growing.
  • Ccache + remote caching — saves on re-compilation of unchanged TUs, but cold builds (or any TU where any transitive include changed) still process the full mega-file.
  • concepts/content-addressed-caching (Bazel remote cache) — same constraint: a TU whose inputs change pays the full bytes cost.

None of these attack the byte count itself. Reducing what the compiler has to process is strictly upstream of making it process faster or skipping it when possible.
