First-principles theoretical limit

Definition

First-principles theoretical-limit reasoning asks: given the physics of the hardware and the structure of the workload, what is the floor on wall-clock time, ignoring the current implementation? The gap between that floor and the observed time is the opportunity surface.

Distinct from "best practice," "benchmark against peers," or "prior-version +5%." Instead: compute the floor, observe the ceiling, attack the gap.

Canva's application

From the Canva CI retrospective:

From first principles, we knew that:

  • Modern computers are incredibly fast.
  • PR changes are relatively small (a couple hundred lines on average).
  • One build or test action shouldn't take more than a few minutes on a modern computer, and the critical path shouldn't have more than 2 long actions dependent on each other.

So, if we assume a few minutes = 10 and multiply that by 2 (2 actions dependent on each other, each taking 10 minutes), we have a theoretical limit of approximately 20 minutes for the worst-case build scenario. However, we had builds taking up to 3 hours! What was causing this massive difference?

That ~10× gap (20 min floor vs. 3 h observed) framed every subsequent investigation.
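The arithmetic behind that floor-versus-observed comparison can be sketched directly (the numbers are the ones quoted in the retrospective above):

```python
# First-principles floor for the worst-case build, per the Canva note:
# two dependent actions on the critical path, ~10 minutes each.
ACTION_MINUTES = 10
CRITICAL_PATH_DEPTH = 2

floor_minutes = ACTION_MINUTES * CRITICAL_PATH_DEPTH  # theoretical floor: 20 min
observed_minutes = 3 * 60                             # worst observed builds: ~3 hours

gap = observed_minutes / floor_minutes
print(f"floor={floor_minutes} min, observed={observed_minutes} min, gap={gap:.0f}x")
# → floor=20 min, observed=180 min, gap=9x
```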

Diagnostic loop

  1. Compute the floor. What does the workload have to do? What hardware is available? What's the critical path's theoretical length?
  2. Measure the ceiling. What does it actually take, P50 / P90 / P99?
  3. Instrument the gap. Where does the wall-clock go? (I/O, CPU, serialization, queueing, warm-up.) Canva used a 448-CPU / 6 TB RAM instance as a diagnostic instrument to separate "single-CPU-critical-path-bound" from "distributed-system-bound".
  4. Attack the biggest component. Close some of the gap. Recompute floor and ceiling — they've both probably moved.
  • If the floor is set by a single-thread action, scaling out doesn't help. Speed up the action itself, or parallelize inside it.
  • If the floor is set by data movement, reduce it (BwoB; patterns/build-without-the-bytes).
  • If the floor is set by queueing, reduce queue layers (concepts/queueing-theory).
  • If the observed is orders of magnitude above floor, start with the cheapest structural fix (step consolidation, caching) before deep tuning.
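The four-step loop above can be sketched as code. Everything here is a minimal, hypothetical illustration: the step timings, sample distribution, and span categories are invented, not Canva's actual data.

```python
def floor_minutes(critical_path):
    """Step 1: the floor is the summed length of the dependent critical-path actions."""
    return sum(critical_path)

def ceiling_minutes(samples, pct):
    """Step 2: observed ceiling at a percentile (nearest-rank method)."""
    ordered = sorted(samples)
    rank = max(1, round(pct / 100 * len(ordered)))
    return ordered[rank - 1]

def biggest_component(spans):
    """Steps 3-4: attribute wall-clock to categories, attack the largest first."""
    return max(spans, key=spans.get)

critical_path = [10, 10]                     # two dependent ~10-minute actions
observed = [25, 40, 60, 90, 120, 150, 180]   # sampled build times, minutes (invented)

floor = floor_minutes(critical_path)         # 20
p90 = ceiling_minutes(observed, 90)

# Instrumented breakdown of one slow build (minutes, invented):
spans = {"queueing": 45, "serialization": 20, "cpu": 15, "io": 100}

print(f"floor={floor} min, P90={p90} min, attack first: {biggest_component(spans)}")
# → floor=20 min, P90=150 min, attack first: io
```

After attacking the biggest span, re-run both `floor_minutes` and `ceiling_minutes` — as the loop's step 4 notes, both have probably moved.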

Contrast: benchmark-against-peers

The floor framing is stricter than "Netflix does it in 30 min so we should too." Peer benchmarks set a ceiling based on someone else's implementation, not on physics. A team stuck inside a comfortable peer envelope can miss a 10× opportunity that first-principles would surface.
