

C++ compilation model (pre-processing & transitive includes)

Definition

C++ compilation happens per translation unit (TU) — normally one .cpp file + everything it #includes, transitively. In the pre-processing step the compiler literally replaces #include "foo.h" with the bytes of foo.h, which may itself contain #include "bar.h", and so on. The result is a single mega-file that is then parsed, template-instantiated, optimised, and compiled.

This has a load-bearing consequence for build performance:

C++ build times are roughly proportional to the number of bytes sent to the compiler after pre-processing. (Source: sources/2024-04-27-figma-speeding-up-c-build-times)

If file C #includes B which #includes A, then compiling C processes all of A's bytes, even if C never directly names any symbol from A.

Why it's a scaling problem

Two independent failure modes both inflate post-pre-processing bytes:

  1. Unnecessary includes. A header is listed in #include but the file uses nothing from it directly — included only because something it transitively pulled was used (or once was, and isn't now). Removing the direct include shrinks the TU's transitive closure.
  2. Used-but-huge includes. A header is directly used, so it's a correct include, but including it drags in a megabyte-class transitive tree. Fix = concepts/forward-declaration or split the header so the using file can include only the part it needs.

These two modes require different tools — the first is static-analysis over AST + symbol tables (systems/include-what-you-use, systems/diwydu); the second is measurement over the include DAG (systems/includes-py).

Diagnostic signal: bytes/LOC ratio drift

Figma's symptom was build-time growth decoupling from code growth: +10% LOC / +50% build time in one year. Concretely, the ratio of compiler-seen bytes to added LOC was growing unboundedly — each new header file's transitive closure kept getting bigger. This is a canary for include-graph dependency bloat even before build times become a top pain point.

Why cache + faster hardware isn't enough

Figma's pre-rewrite mitigations:

  • M1 Max laptops — better serial compile throughput, but a one-time win: build times climbed back to their previous levels within months as bytes kept growing.
  • Ccache + remote caching — saves on re-compilation of unchanged TUs, but cold builds (or any TU where any transitive include changed) still process the full mega-file.
  • concepts/content-addressed-caching (Bazel remote cache) — same constraint: a TU whose inputs change pays the full bytes cost.

None of these attack the byte count itself. Reducing what the compiler has to process is strictly upstream of making it process faster or skipping it when possible.
