PLANETSCALE 2022-01-18

PlanetScale — How our Rails test suite runs in 1 minute on Buildkite¶

Summary¶

Mike Coutermarsh's 2022-01-18 PlanetScale post walks through two concrete interventions that took PlanetScale's Ruby-on-Rails backend test suite from ~12 minutes serially on a developer MacBook Pro to ~1 minute on CI. (1) Spread the test suite across 64 parallel worker processes on a 64-core Buildkite agent via Rails + minitest's native parallelize(workers: 64). (2) Audit FactoryBot factories to eliminate the common failure mode where setting up one test object implicitly creates a much larger object graph (in PlanetScale's case, up to 8× as many objects as the author expected), then lock the fix in place with assertions on factory object counts so future refactors can't regress the fast-setup property.

The framing is deliberately opinionated about where the performance work should happen: test-suite speed is a CI-level concern, not a local-developer concern. "We never run all of our application's tests in local development. It's not a good use of time and will never be as fast as running them on CI."

Key takeaways¶

Parallelism is cheaper than optimisation, up to a point. The first intervention (go from 2 workers to 64 workers on a 64-core Buildkite agent) brought the suite down to 3-4 min on CI, against a ~12 min serial run on a developer laptop. No test logic changed, only the worker count. "This had the biggest impact and is also the easiest step to improve your test suite speed." Canonical wiki framing of worker-count-bounded-by-cores as the first lever.
Past a core-count ceiling, optimisation is unavoidable. Once you've saturated the machine's cores, speedups come only from making individual tests faster. PlanetScale's second intervention (fix factories) took the suite from 3-4 min to ~1 min — the same order of magnitude of improvement as the parallelism step, but achieved by reducing work-per-test rather than distributing work across more workers.
FactoryBot's convenience is also its failure mode. The FactoryBot object-explosion failure: easy-to-declare associations silently cascade into much larger object graphs than intended. "The library makes it so easy to set up relationships between data that it's possible to trigger the creation of more associated objects than you expect." PlanetScale found factories creating up to 8× as many objects as the author intended, and the cost lands in every test that uses the factory.
Diagnosis via debugger + test introspection. To find the explosion, drop into pry immediately after test setup and count the objects live. "We began investigating this by putting a debugger in our tests to stop execution right after the test setup … We found a few surprising places where we were creating up to 8× as many objects as we thought we were."
Encode the fix as an assertion. After fixing the factory, PlanetScale adds a test that asserts the expected object count for every factory: the new assert-factory-object-count pattern. The canonical shape (verbatim from the post):

test "factory doesn't create tons of databases" do
  create(:database)
  assert_equal 1, Database.count
end

"We keep these tests in our models, protecting us from any regressions when making changes to our factories." The assertion is a property of the factory, not of any test that uses it — so one broken factory fails one dedicated test rather than poisoning dozens of downstream tests.

Buildkite's split architecture lets the customer own the core count. Buildkite's customer-owned agent model (agents run on customer-managed EC2 / Kubernetes pods inside the customer VPC, with Buildkite handling the scheduler / UI) means PlanetScale can provision 64-core machines to serve their test suite without asking a vendor to support that instance shape. "Our infrastructure team set us up with some 64 core machines on Buildkite." Canonical wiki datum: customer-owned-agent CI models decouple test-parallelism economics from vendor pricing tiers.
Don't optimise the local path. The CI-parallel-over-local-serial investment rule: "We haven't put much effort here because it's not something our engineers ever run." The developer workflow is single-file / single-test locally ("When working locally, we'll run the tests for the single file we modified, or just a single test at a time"); the full-suite run is a CI concern, not a local concern.
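The worker-count lever from the first takeaway can be sketched as a test/test_helper.rb fragment. parallelize(workers:) is the Rails call the post names; the test_worker_count helper and its cap at the host's core count are assumptions added for illustration (the post simply passes 64), and the Rails part is guarded so the file also loads outside a Rails app.

```ruby
require "etc"

# Hypothetical helper (not from the post): never ask for more workers than
# the host has cores, since extra processes would only contend for CPU.
def test_worker_count(requested: 64, cores: Etc.nprocessors)
  [requested, cores].min
end

# In a Rails app this lives in test/test_helper.rb, after rails/test_help
# has loaded ActiveSupport::TestCase. The guard keeps the sketch loadable
# in plain Ruby.
if defined?(ActiveSupport::TestCase)
  class ActiveSupport::TestCase
    # The post's configuration is the literal parallelize(workers: 64).
    parallelize(workers: test_worker_count)
  end
end
```

Rails' generated test helper defaults to parallelize(workers: :number_of_processors), and the PARALLEL_WORKERS environment variable overrides the configured count for a single run, so an explicit cap like this is only one of several ways to keep workers at or below cores.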

Systems¶

systems/buildkite — PlanetScale's CI control plane + agent fleet. The 64-core agent pool is what makes parallelize(workers: 64) viable; the wiki now records the 2022 PlanetScale deployment alongside the already-canonicalised 2024 Canva deployment.
systems/planetscale — the application is the Ruby-on-Rails backend API of PlanetScale's control plane, independent of the MySQL/Vitess data plane that is PlanetScale's product.

Concepts¶

concepts/test-parallelism-worker-count — the worker-count knob in parallelised test runners, bounded by host core count.
concepts/factorybot-object-explosion — test-fixture library failure mode where association declarations implicitly create larger object graphs than expected.
concepts/test-feedback-loop — the time from push-commit to full-suite pass/fail signal as a DevEx primitive.
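The object-explosion concept is easier to see in a toy model. The sketch below uses plain Ruby stand-ins rather than ActiveRecord or FactoryBot; ToyModel, the three model names, and the create_* factory methods are all hypothetical and only mimic how a convenient default association can create a parent nobody asked for.

```ruby
# Toy in-memory stand-in for ActiveRecord models: each class tracks its rows.
class ToyModel
  class << self
    def rows
      @rows ||= []
    end

    def create(**attrs)
      record = new(**attrs)
      rows << record
      record
    end

    def count
      rows.size
    end

    def reset!
      rows.clear
    end
  end

  attr_reader :attrs

  def initialize(**attrs)
    @attrs = attrs
  end
end

class Organization < ToyModel; end
class User < ToyModel; end
class Database < ToyModel; end

# "Convenient" factories: each one builds its own parents by default.
def create_user(organization: Organization.create)
  User.create(organization: organization)
end

# Leaky: the default owner creates a second Organization behind the scenes.
def create_database(organization: Organization.create, owner: create_user)
  Database.create(organization: organization, owner: owner)
end

# Fixed: the default owner reuses the organization already in hand.
def create_database_fixed(organization: Organization.create,
                          owner: create_user(organization: organization))
  Database.create(organization: organization, owner: owner)
end

def row_counts
  [Organization, User, Database].to_h { |model| [model.name, model.count] }
end

create_database
puts row_counts.inspect # two organizations for a single database
```

One extra parent per call looks harmless here, but each additional association level can repeat the same mistake, which is how a single factory call can plausibly reach the multiples the post reports.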

Patterns¶

patterns/assert-factory-object-count — encode expected factory-output object count as its own assertion, keeping the invariant in source so future refactors can't regress it.
patterns/ci-parallel-over-local-serial — invest engineering effort in the CI full-suite path (where parallelism is cheap), not the local full-suite path (which no one runs anyway).
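One possible generalisation of the assert-factory-object-count pattern is to measure row deltas across several models at once, so the guard still works when test setup has already created rows. This is a sketch under stated assumptions: created_counts and FakeTable are hypothetical names, and FakeTable only stands in for any model class that responds to name and count.

```ruby
# Stand-in for an ActiveRecord model class: anything with `name` and `count`.
class FakeTable
  attr_reader :name, :count

  def initialize(name)
    @name = name
    @count = 0
  end

  def insert!
    @count += 1
  end
end

# Returns how many rows each model gained while the block ran.
def created_counts(*models)
  before = models.map(&:count)
  yield
  models.zip(before).to_h { |model, was| [model.name, model.count - was] }
end

databases = FakeTable.new("Database")
organizations = FakeTable.new("Organization")

deltas = created_counts(databases, organizations) do
  databases.insert!
  organizations.insert!
  organizations.insert! # the surprise row a leaky factory would add
end

puts deltas.inspect # one new Database row, two new Organization rows

# In a Rails model test the same helper could back an assertion such as:
#   assert_equal({ "Database" => 1, "Organization" => 1 },
#                created_counts(Database, Organization) { create(:database) })
```

Comparing deltas rather than absolute counts keeps the guard independent of whatever fixtures or setup blocks ran before the factory call.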

Operational numbers¶

Serial baseline (local): ~12 minutes on a MacBook Pro.
Initial CI parallelisation: 2 workers, some speedup (unquantified).
64-worker CI parallelisation: 3-4 minutes.
Post-factory-audit CI: ~1 minute.
Agent hardware: 64-core Buildkite agents.
FactoryBot over-creation factor: up to 8× the expected object count in a single factory call.
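The speedup factors implied by these figures can be worked out directly. The sketch below uses only the wall-clock numbers listed above; note that the ~12 minute baseline was measured on a laptop rather than on the CI agent, so the efficiency figure is a rough indicator, not a measurement.

```ruby
# Suite-level wall-clock figures reported in the post, in minutes.
SERIAL_LOCAL = 12.0        # MacBook Pro, serial
PARALLEL_CI  = [3.0, 4.0]  # 64 workers, reported as a 3-4 minute range
AFTER_AUDIT  = 1.0         # 64 workers, after the factory audit
WORKERS      = 64

def speedup(before, after)
  before / after
end

# Low and high ends of each step's speedup, given the 3-4 minute range.
parallel_step = PARALLEL_CI.map { |after| speedup(SERIAL_LOCAL, after) }.minmax
factory_step  = PARALLEL_CI.map { |before| speedup(before, AFTER_AUDIT) }.minmax
overall       = speedup(SERIAL_LOCAL, AFTER_AUDIT)

# Speedup per worker: 1.0 would mean perfectly linear scaling.
efficiency = parallel_step.map { |factor| factor / WORKERS }

puts format("parallelism step: %.1fx to %.1fx", *parallel_step)
puts format("factory audit step: %.1fx to %.1fx", *factory_step)
puts format("overall: %.1fx", overall)
puts format("speedup per worker: %.3f to %.3f", *efficiency)
```

Both steps land in the same 3-4× band, and the per-worker figure sits far below 1.0, which is consistent with the caveat that the post gives no serial baseline on CI hardware to compare against.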

Caveats¶

Narrow scope — this is a Ruby-on-Rails-CI post from PlanetScale's application tier, not a distributed-systems internals post. Included here because it canonicalises the parallelism-then-optimisation-then-invariant-assertion sequence as a CI-engineering pattern, and because the FactoryBot-object-explosion failure mode is a general shape (over-associated fixture factories exist in every test framework, not just Rails).
No per-test cost numbers — the post reports suite-level wall-clock only. The per-test or per-factory cost distribution before and after the audit isn't shared; we don't know if the 8× factor was uniform or concentrated in a few factories.
The 2-worker → 64-worker step is under-documented. The post notes "this gave us some speed gains" without a number for the 2-worker baseline on CI; only the jump to 64 workers is quantified.
Flakiness not discussed. 64-way parallelism typically surfaces latent test flakiness (shared state, DB cleanup ordering, port contention). The post doesn't describe infrastructure PlanetScale added to handle that.
No DB / fixture strategy disclosed. Whether each worker gets its own MySQL database / schema, whether transactional fixtures are used, how database migrations are pre-warmed for each worker — all important CI-parallelism questions — are elided.
Over-parallelism ceiling not characterised. The post stops at 64 workers matching the 64-core agent; no mention of workers ≠ cores heuristics, hyperthreading, or whether IO-bound tests want higher worker count than CPU count.
The fix is engineering-costly. The narrative makes the factory audit sound mechanical ("Solving this was more straightforward once we knew the problem"), but auditing every factory's output graph across a large Rails app is a substantive engineering investment; the post doesn't quantify that cost.
Tier-3 / narrow-scope — this is a CI-engineering post rather than a distributed-systems-at-scale post. It sits in the same genre as Canva's faster CI builds (sources/2024-12-16-canva-faster-ci-builds) but at a smaller scale and with a narrower remediation story.
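On the undisclosed DB strategy: Rails' built-in process-based parallel testing gives each worker its own database, named by suffixing the worker number, and exposes per-worker hooks. Whether PlanetScale relied on these defaults is not stated in the post, so the following is only a sketch of what the framework offers; worker_database_name is a hypothetical helper that mirrors the naming rule, and the hook bodies are placeholders.

```ruby
# Hypothetical helper mirroring Rails' naming rule for per-worker test
# databases: the base name suffixed with "-<worker number>".
def worker_database_name(base, worker)
  "#{base}-#{worker}"
end

# Guarded so the sketch loads outside a Rails app; in an app this would sit
# in test/test_helper.rb alongside the parallelize call.
if defined?(ActiveSupport::TestCase)
  class ActiveSupport::TestCase
    parallelize_setup do |worker|
      # Runs once in each forked worker after its database exists;
      # a place for per-worker seeding (placeholder, nothing from the post).
    end

    parallelize_teardown do |worker|
      # Runs once per worker before it exits; a place for cleanup.
    end
  end
end

puts worker_database_name("app_test", 0)
```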

Source¶

systems/buildkite — split-architecture CI orchestrator; second canonical wiki Seen-in after Canva.
systems/planetscale — PlanetScale-the-company's app tier (as distinct from the MySQL/Vitess data plane product).
concepts/test-parallelism-worker-count — the N-workers lever.
concepts/factorybot-object-explosion — FactoryBot-specific failure mode.
concepts/test-feedback-loop — feedback-cycle time as a DevEx primitive.
patterns/assert-factory-object-count — encode the factory-output-size invariant as its own test.
patterns/ci-parallel-over-local-serial — invest in CI parallelism, not local serial.
companies/planetscale — PlanetScale company page; 2029th PlanetScale first-party ingest under Mike Coutermarsh.