SYSTEM Cited by 1 source

includes.py (Figma transitive-byte CI counter)

What it is

includes.py is Figma's internal CI tool that measures how many bytes a given C++ source file sends to the compiler after pre-processing, and flags per-PR regressions in that number. It sits alongside DIWYDU (systems/diwydu) in Figma's build-time defense stack and targets the failure mode DIWYDU cannot see: an include that is genuinely used but whose transitive closure is huge.

Architecture

  • Pure Python. Does not invoke Clang. The entire cost model is "count bytes, follow #include directives." This is what makes it fast enough to run in CI on every PR — "usually in just a couple of seconds."
  • Crawl. Walks all first-party header, source, and generated files in the codebase, recording byte counts and raw #include text.
  • Parse. For each file, extracts #include directives and resolves header paths to other first-party files.
  • Standard-library assumption. Standard-library headers are counted as 0 bytes. The article is explicit that this is a safe assumption for Figma specifically: stdlib usage is confined to one directory of wrappers. Any project that uses the STL directly throughout would need a different assumption.
  • Build the DAG. Constructs the include graph from the parsed edges.
  • Sum transitive bytes. For each source file, the reported cost is the file's own byte count plus the byte counts of every file in its transitive include closure: file_bytes + sum(transitive_closure_bytes).
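The steps above can be sketched in a few dozen lines of Python. This is a minimal illustration, not Figma's implementation: the function names, the file-suffix set, and the rule that include paths resolve relative to the repository root are all assumptions. Includes that do not resolve to a crawled file (for example standard-library headers) contribute 0 bytes, and each reachable header is counted once.

```python
import re
from pathlib import Path

# Matches only literal `#include "x"` / `#include <x>` lines (plain text, no Clang).
INCLUDE_RE = re.compile(rb'^[ \t]*#[ \t]*include[ \t]*[<"]([^>"\n]+)[>"]', re.MULTILINE)
SUFFIXES = {".h", ".hpp", ".cc", ".cpp"}  # assumed set of first-party extensions


def crawl(root):
    """Crawl: map each first-party file (path relative to root) to its raw bytes."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): p.read_bytes()
        for p in root.rglob("*")
        if p.is_file() and p.suffix in SUFFIXES
    }


def parse(sources):
    """Parse + build the graph: return (sizes, graph) for a {path: bytes} mapping."""
    sizes = {name: len(data) for name, data in sources.items()}
    graph = {}
    for name, data in sources.items():
        targets = [m.decode("utf-8", "replace") for m in INCLUDE_RE.findall(data)]
        # Assumption: include paths are written relative to the repository root.
        # Unresolved targets (e.g. <vector>) are dropped, i.e. cost 0 bytes.
        graph[name] = [t for t in targets if t in sizes]
    return sizes, graph


def transitive_bytes(name, sizes, graph):
    """Own bytes plus the bytes of every reachable file, each counted once."""
    seen, stack = set(), [name]
    while stack:
        cur = stack.pop()
        if cur not in seen:
            seen.add(cur)
            stack.extend(graph[cur])
    return sum(sizes[f] for f in seen)
```

A run over a checkout would look like `sizes, graph = parse(crawl("path/to/repo"))`, followed by `transitive_bytes(f, sizes, graph)` for each source file `f`. The `seen` set also keeps the walk from looping if two headers include each other.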

How it's used in CI

includes.py runs in Figma's CI on every PR and reports, per source file, the byte delta introduced by the change. Significant regressions produce a warning that blocks merge until addressed.
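As a rough sketch of what such a per-PR check might look like, the snippet below compares per-file transitive byte totals computed on the base branch and on the PR head. The function names and the 100,000-byte threshold are hypothetical; the article only says "significant" regressions are flagged and does not give a number.

```python
def byte_deltas(base, head):
    """Per-file change in transitive bytes; files new in the PR compare against 0."""
    return {name: total - base.get(name, 0) for name, total in head.items()}


def flag_regressions(base, head, threshold_bytes=100_000):
    """Return (file, delta) pairs at or above the threshold, largest first.

    threshold_bytes is a made-up default for illustration only.
    """
    deltas = byte_deltas(base, head)
    flagged = [(name, d) for name, d in deltas.items() if d >= threshold_bytes]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Example: one file grows by 150,000 bytes, one shrinks, one is new but small.
base = {"a.cpp": 1_000, "b.cpp": 2_000}
head = {"a.cpp": 151_000, "b.cpp": 1_900, "c.cpp": 50}
for name, delta in flag_regressions(base, head):
    print(f"WARNING: {name} grew by {delta} transitive bytes")
```

Here only `a.cpp` is reported; the shrink in `b.cpp` and the small new file `c.cpp` fall below the assumed threshold.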

This is the canonical instantiation of the CI regression budget gate pattern (patterns/ci-regression-budget-gate) in the build-performance domain: the budget is post-pre-processing bytes per translation unit, measured cheaply and surfaced before the cost materialises as slow builds.

The engineer's options on a flagged PR:

  1. Drop the offending #include if unused → same as the DIWYDU fix.
  2. Replace it with a forward declaration (concepts/forward-declaration) if the symbol is only used in ways that don't require the complete type, such as through pointers or references.
  3. Split the header so the using file includes only the part it needs.

The article notes that Figma most often recommends option (2), and for that case developed a per-directory Fwd.h header so that individual engineers rarely need to write forward declarations by hand.

What makes it work

  • Cheap. A full Clang parse would be too slow for per-PR CI. Plain-text #include extraction plus byte counting runs in seconds.
  • Right cost model. Because C++ build time is approximately proportional to post-pre-processing bytes (concepts/c-plus-plus-compilation-model), a bytes-delta gate is a direct proxy for the build-time delta: measurable, reproducible, and not a lagging indicator.
  • Pre-merge gate, not post-merge lament. The article's most striking operational claim:

With these tools, we've been able to automatically identify and quickly rectify these regressions.

And: 50-100 potential slowdowns per day are caught at PR time that would otherwise have landed in master. Without the gate, each regression would have had to be discovered via a build-time complaint weeks later, bisected, and reverted.

Limitations

  • Standard-library 0-byte assumption is unsound outside Figma-shaped codebases.
  • Generated files are crawled, but they are only counted if the generator has already run by crawl time.
  • Macros that expand to #include aren't handled (not explicitly called out, but implicit in "pure-Python, no Clang").
  • No symbol-level tracking — can't tell you which include is the one driving the regression, only that the total went up. Engineers still have to diagnose.
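The macro limitation is easy to see with a plain-text matcher. The regular expression and the file paths below are illustrative assumptions, not taken from includes.py: a literal include is found, while an include whose target comes from a macro is invisible without running the preprocessor.

```python
import re

# Assumed plain-text matcher for literal #include "x" / <x> directives.
INCLUDE_RE = re.compile(r'^[ \t]*#[ \t]*include[ \t]*[<"]([^>"\n]+)[>"]', re.MULTILINE)

# Hypothetical source file: the second include is computed from a macro.
source = '''#define PLATFORM_HEADER "platform/mac.h"
#include "ui/widget.h"
#include PLATFORM_HEADER
'''

found = INCLUDE_RE.findall(source)
print(found)  # only the literal include is seen
```

In this example `platform/mac.h` never enters the graph, so its bytes would be silently missing from the total.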
