CONCEPT Cited by 1 source

Build graph¶

Definition¶

A build graph is the DAG of declared build/test actions and their dependencies. Nodes are actions (compile-this, link-that, run-this-test); edges are "X needs Y's output as input." Modern build systems (Bazel, Buck, Pants, Nix) operate on the build graph explicitly: they load it from declaration files, analyze it, and schedule actions against workers.

Why "the graph" matters¶

Having the graph as a first-class, analyzable object enables:

Incremental builds. Only re-run actions whose inputs (transitively) changed — computed as a graph diff (patterns/build-without-the-bytes complements this).
Parallelism. Actions with no edges between them run concurrently, bounded only by worker count. Canva's critical-path analysis is literally "longest path in the graph."
Caching. Each node is keyed by its input-subtree digest; a global cache turns the graph into "only compute this node if nobody has ever computed it with these exact inputs" (concepts/content-addressed-caching).
Remote execution. Node granularity = unit of distribution.

Scale at Canva¶

From the retrospective:

Our build graph has more than 10^5 nodes and some expensive build actions (5–10 minutes), which run often… It didn't help that we had a build graph with over 900K nodes.

The graph-load step is itself an action:

Loading and analyzing the build graph to generate the necessary actions from the /WORKSPACE, **/BUILD.bazel, and *.bzl files, plus the filesystem and cache state, might take minutes, especially when you include many targets.

This is why Canva's experiments included bazel build //... (OOMed because the graph was too big to load into JVM), bazel build //... - //web/... (worked), and bounded-scope pipeline generation. The graph's size is itself a first-order constraint.

Diff-against-graph pattern¶

bazel-diff (Tinder/bazel-diff) computes a per-target input-hash manifest and compares two commits' manifests to produce a "targets that need to run" list. Canva's pipeline v3 pre-computes this manifest on dedicated instances as soon as a commit is pushed and uploads to S3; CI jobs download the manifest at runtime. (See patterns/static-pipeline-generation.)

concepts/hermetic-build — every graph node declares its inputs.
concepts/content-addressed-caching — graph node = cache key granularity.
concepts/critical-path — longest-path computation over the graph.
concepts/remote-build-execution — the worker-distribution model for graph actions.
systems/bazel — the canonical build-graph-as-object build system.

Seen in¶

sources/2024-12-16-canva-faster-ci-builds — 900K-node graph load time as a first-order cost; bazel-diff as graph-diff-for-incremental-CI.