CONCEPT Cited by 1 source
Build graph¶
Definition¶
A build graph is the DAG of declared build/test actions and their dependencies. Nodes are actions (compile-this, link-that, run-this-test); edges are "X needs Y's output as input." Modern build systems (Bazel, Buck, Pants, Nix) operate on the build graph explicitly: they load it from declaration files, analyze it, and schedule actions against workers.
Why "the graph" matters¶
Having the graph as a first-class, analyzable object enables:
- Incremental builds. Only re-run actions whose inputs (transitively) changed — computed as a graph diff (patterns/build-without-the-bytes complements this).
- Parallelism. Actions with no edges between them run concurrently, bounded only by worker count. Canva's critical-path analysis is literally "longest path in the graph."
- Caching. Each node is keyed by its input-subtree digest; a global cache turns the graph into "only compute this node if nobody has ever computed it with these exact inputs" (concepts/content-addressed-caching).
- Remote execution. Node granularity = unit of distribution.
Scale at Canva¶
From the retrospective:
Our build graph has more than 10^5 nodes and some expensive build actions (5–10 minutes), which run often… It didn't help that we had a build graph with over 900K nodes.
The graph-load step is itself an action:
Loading and analyzing the build graph to generate the necessary actions from the
/WORKSPACE,**/BUILD.bazel, and*.bzlfiles, plus the filesystem and cache state, might take minutes, especially when you include many targets.
This is why Canva's experiments included bazel build //... (OOMed because
the graph was too big to load into JVM), bazel build //... - //web/...
(worked), and bounded-scope pipeline generation. The graph's size is itself
a first-order constraint.
Diff-against-graph pattern¶
bazel-diff (Tinder/bazel-diff)
computes a per-target input-hash manifest and compares two commits'
manifests to produce a "targets that need to run" list. Canva's pipeline
v3 pre-computes this manifest on dedicated instances as soon as a commit
is pushed and uploads to S3; CI jobs download the manifest at runtime.
(See patterns/static-pipeline-generation.)
Related¶
- concepts/hermetic-build — every graph node declares its inputs.
- concepts/content-addressed-caching — graph node = cache key granularity.
- concepts/critical-path — longest-path computation over the graph.
- concepts/remote-build-execution — the worker-distribution model for graph actions.
- systems/bazel — the canonical build-graph-as-object build system.
Seen in¶
- sources/2024-12-16-canva-faster-ci-builds — 900K-node graph load time as a first-order cost; bazel-diff as graph-diff-for-incremental-CI.