Canva
Canva Engineering blog (https://www.canva.dev/blog/engineering/). Not
yet in raw/_feeds.yaml's core tier list as of 2026-04-21; single-article
ingests so far. Posts skew toward product/design content but include
occasional substantive architecture retrospectives (Creators-payment
counting pipeline, Print Routing engine, search/serving-infra posts,
data-platform work).
Key systems
- systems/canva-ci — CI system (Bazel + Buildkite + EC2 + bazel-remote + TestContainers); PR-to-merge time cut from 80 min to <30 min over 2 years; 900K-node Bazel graph; >1000 check-merge builds per workday.
- systems/canva-usage-counting — Creators-payment counting pipeline: DynamoDB raw events → Snowflake + DBT dedup/aggregation → S3 + SQS + rate-limited RDS ingester for serving.
- systems/canva-print-routing — Print order routing engine: three-subsystem split (Build+Retrieve / Decision / Traversal); relational source of truth → async-rebuilt per-region graphs in ElastiCache/Redis; modified successive-shortest-path min-cost-flow with rule-ordered Decision engine; per-order routing log for explainability. p99 50 ms peak, 99.999% availability.
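The Print Routing entry names a modified successive-shortest-path min-cost-flow as its traversal algorithm. The sketch below shows only the textbook, unmodified form of that algorithm, to illustrate the idea of repeatedly pushing flow along the cheapest remaining path. It is not Canva's implementation: the node layout, capacities, costs, and the helper names `add_edge`, `build_example`, and `min_cost_flow` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Edge:
    to: int      # head node of the edge
    cap: int     # remaining (residual) capacity
    cost: int    # cost per unit of flow
    rev: int     # index of the paired reverse edge in graph[to]


def add_edge(graph, u, v, cap, cost):
    """Add a forward edge u -> v and its zero-capacity reverse edge."""
    graph[u].append(Edge(v, cap, cost, len(graph[v])))
    graph[v].append(Edge(u, 0, -cost, len(graph[u]) - 1))


def min_cost_flow(graph, source, sink, demand):
    """Textbook successive shortest path; returns (flow_sent, total_cost)."""
    n = len(graph)
    flow, total_cost = 0, 0
    while flow < demand:
        # Bellman-Ford over the residual graph (reverse edges have negative cost).
        dist = [float("inf")] * n
        prev = [None] * n  # prev[v] = (u, index of the edge u -> v)
        dist[source] = 0
        for _ in range(n - 1):
            updated = False
            for u in range(n):
                if dist[u] == float("inf"):
                    continue
                for i, e in enumerate(graph[u]):
                    if e.cap > 0 and dist[u] + e.cost < dist[e.to]:
                        dist[e.to] = dist[u] + e.cost
                        prev[e.to] = (u, i)
                        updated = True
            if not updated:
                break
        if prev[sink] is None:
            break  # no augmenting path left; demand cannot be fully met

        # Find the bottleneck capacity along the cheapest path.
        push = demand - flow
        v = sink
        while v != source:
            u, i = prev[v]
            push = min(push, graph[u][i].cap)
            v = u

        # Apply the flow and update residual capacities.
        v = sink
        while v != source:
            u, i = prev[v]
            e = graph[u][i]
            e.cap -= push
            graph[v][e.rev].cap += push
            v = u
        flow += push
        total_cost += push * dist[sink]
    return flow, total_cost


def build_example():
    """Hypothetical graph: 0 = orders, 1 = printer A, 2 = printer B, 3 = region."""
    graph = [[] for _ in range(4)]
    add_edge(graph, 0, 1, cap=2, cost=4)
    add_edge(graph, 0, 2, cap=3, cost=6)
    add_edge(graph, 1, 3, cap=2, cost=1)
    add_edge(graph, 2, 3, cap=3, cost=1)
    return graph


if __name__ == "__main__":
    print(min_cost_flow(build_example(), source=0, sink=3, demand=3))
```

In this toy graph the cheaper printer A is saturated first and the remaining unit spills to printer B. A production engine would presumably layer business rules and backtracking on top, which this sketch does not attempt.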
Key patterns / concepts
- patterns/build-without-the-bytes — Bazel --remote_download_minimal + retry-on-eviction sped up BE builds 2× and ML builds 3.3×.
- patterns/pipeline-step-consolidation — BE/ML Pipeline v2: 45 → 16 steps, ~50% build-minutes cut.
- patterns/static-pipeline-generation — pipeline generation off the critical path: >10 min → ~0 via static YAML + S3-backed bazel-diff manifest.
- patterns/instance-shape-right-sizing — i4i.8xlarge for I/O-heavy Bazel builds (3 h → 15 min); c6id.12xlarge rebalance (-2 to -6 min).
- patterns/snapshot-based-warmup — EBS-snapshot CI-agent warm-ups; P95 wait 40 → 10 min.
- patterns/shaping-vs-building — Canva PDP discipline of separating exploration from production engineering.
- patterns/end-to-end-recompute — rerun-the-pipeline as the recovery story, enabled by outer-join overwrite at the sink.
- patterns/warehouse-unload-bridge — OLAP → OLTP export via S3 + SQS + rate-limited ingester.
- patterns/async-projected-read-model — async-rebuilt per-region graph as the serving tier (Print Routing).
- patterns/deterministic-rule-ordering — ordered rule list + real-valued terminal tie-breaker = same-input-same-output routing.
- patterns/explainability-log — per-order structured decision log for post-hoc "why was it routed this way?" questions.
- patterns/pluggable-component-architecture — Build/Decision/Traversal subsystems with narrow contracts; independent evolution and A/B testing.
- concepts/hermetic-build — the CI precondition for caching, parallelism, reproducibility (TestContainers rollout).
- concepts/content-addressed-caching — bazel-remote + S3 as the Canva CI cache.
- concepts/critical-path — the bounding metric for CI time; moving target across iterations.
- concepts/first-principles-theoretical-limit — 20-min floor vs 3-h observed framing of the CI project.
- concepts/build-graph — 900K-node Bazel DAG as a first-order CI cost.
- concepts/remote-build-execution — partial rollout: 200% faster TS builds, cost win on BE compile+pack.
- concepts/oltp-vs-olap — the archetype mismatch that motivated the billing migration.
- concepts/elt-vs-etl — the ELT re-platform with DBT on Snowflake.
- concepts/compute-storage-separation — the Snowflake property that made end-to-end recompute feasible.
- concepts/cqrs — same shape as Print Routing: relational write-side + async-projected graph read-side.
- concepts/min-cost-flow — Print Routing's underlying algorithm (modified successive shortest path).
- concepts/geographic-sharding — Print Routing shards graphs by destination region; retrieval + traversal stay fast.
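The deterministic-rule-ordering entry above (ordered rule list plus a real-valued terminal tie-breaker) can be sketched as a lexicographic sort key. This is a minimal illustration, not the Decision engine itself: the `Candidate` fields, the two rules in `RULES`, and the use of distance as the tie-breaker value are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    printer: str
    meets_sla: bool
    cost_cents: int
    distance_km: float  # real-valued tie-breaker; exact ties are unlikely


# Ordered rule list (hypothetical rules): earlier rules dominate later ones.
# Each rule maps a candidate to a comparable value where lower is better.
RULES = [
    lambda c: 0 if c.meets_sla else 1,  # rule 1: prefer candidates meeting the SLA
    lambda c: c.cost_cents,             # rule 2: then prefer the cheaper one
]


def sort_key(candidate):
    """Lexicographic key: rule results in order, then the terminal tie-breaker."""
    return (*[rule(candidate) for rule in RULES], candidate.distance_km)


def choose(candidates):
    """Pick the best candidate; the result does not depend on input order."""
    return min(candidates, key=sort_key)


CANDIDATES = [
    Candidate("A", meets_sla=True, cost_cents=500, distance_km=120.5),
    Candidate("B", meets_sla=True, cost_cents=500, distance_km=80.2),
    Candidate("C", meets_sla=False, cost_cents=300, distance_km=10.0),
]

if __name__ == "__main__":
    print(choose(CANDIDATES).printer)
```

Because the key is a total order whenever the tie-breaker values differ, reordering or re-running with the same candidates yields the same choice, which is the same-input-same-output property the pattern describes.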
Recent articles
- 2024-12-16 — sources/2024-12-16-canva-faster-ci-builds (Faster CI builds: 80 min → <30 min PR-to-merge; Bazel migration pain points; build-without-the-bytes; hermetic TestContainers; pipeline step consolidation; static pipeline generation; instance-shape right-sizing; EBS-snapshot agent warm-up; shaping-vs-building discipline; per-test runtime caps).
- 2024-12-10 — sources/2024-12-10-canva-routing-print-orders (Print order routing engine: three-subsystem architecture; relational ops DB → async-rebuilt per-region graphs in Redis; min-cost-flow traversal with pairwise path ranking + "regrets" backtracking; deterministic rule-ordered Decision engine with distance tie-breaker; per-order routing logs for explainability; p50 ~30 ms, p99 ~50 ms peak, 99.999% availability).
- 2024-04-29 — sources/2024-04-29-canva-scaling-to-count-billions (Creators-payment counting: MySQL → DynamoDB raw events → OLAP+ELT on Snowflake + DBT; billions/month aggregated in minutes; pipeline latency >1 day → <1 hour; >50% storage reduction; incident rate dropped from ≥1/month to ~1 every few months).
Ingest posture
Apply Tier 3 filter: skip pure product/design posts, marketing, and feature announcements; ingest when an article covers distributed systems internals, scaling trade-offs, data-platform or search infrastructure architecture, or production-incident retrospectives.