
PATTERN

Declarative load-test conductor

What this is

Declarative load-test conductor is the pattern of building a dedicated, long-lived microservice that owns the complete lifecycle of a load test: deploying production versions, scaling applications, driving the load generator, scaling back down, and cleaning up. It exposes that capability via a single declarative API in which the client describes the target state of a load test (target KPI, ramp-up, plateau duration, apps in scope) rather than the imperative steps to reach it.

The pattern generalises Zalando's Load Test Conductor (Source: sources/2021-03-01-zalando-building-an-end-to-end-load-test-automation-system-on-top-of-kubernetes).

The canonical shape

The conductor owns five named phases, executed per load-test run:

  1. Deploy production versions into the test cluster. Use the CI/CD platform's API to find the currently-deployed production artifacts, trigger test-cluster deployments, wait for rollout. (See concepts/production-version-cloning-for-load-test.)
  2. Scale applications to production's replica count + resource allocation. Support multiple substrates simultaneously (Zalando: Kubernetes + AWS ECS).
  3. Generate load via a distributed traffic tool, steered by a KPI-driven closed-loop algorithm against a business KPI target.
  4. Scale back down to the pre-test state as a cost mitigation.
  5. Clean up test data (delete simulator-generated orders, payments, audit records, Nakadi events).
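The five phases above can be sketched as one orchestrated run. This is a minimal illustration, not Zalando's implementation; the subsystem clients (`deployer`, `scaler`, `load_driver`, `cleanup`) and their method names are hypothetical:

```python
def run_load_test(spec, deployer, scaler, load_driver, cleanup):
    """Execute one load-test run; scale-down and cleanup always happen."""
    # Phase 1: deploy the currently-live production versions into the test cluster.
    deployer.deploy_production_versions(spec.applications)
    # Phase 2: scale to production replica counts, remembering the pre-test state.
    snapshot = scaler.scale_to_production(spec.applications)
    try:
        # Phase 3: drive the load generator against the business KPI target.
        load_driver.drive(target=spec.target_orders_per_minute,
                          ramp_up_minutes=spec.ramp_up_minutes,
                          plateau_minutes=spec.plateau_minutes)
    finally:
        # Phases 4 and 5 run even if the load phase fails, as cost mitigation.
        scaler.revert(snapshot)
        cleanup.delete_test_data()
```

The `try/finally` encodes the key lifecycle property: scale-down and cleanup are not optional follow-ups but guaranteed phases of every run.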

The declarative contract

Client's request:

targetOrdersPerMinute: <N>
rampUpMinutes: <N>
plateauMinutes: <N>
applications: [<svc-a>, <svc-b>, ...]
useProductionVersions: true | false   # feature-branch exception

The conductor figures out the rest.
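The contract maps naturally onto a small typed spec that the API can validate before any phase starts. A minimal sketch; the class and its validation rules are assumptions, only the field names come from the contract above:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class LoadTestSpec:
    """Declarative description of a load test's target state."""
    target_orders_per_minute: int
    ramp_up_minutes: int
    plateau_minutes: int
    applications: List[str]
    use_production_versions: bool = True  # False = feature-branch exception

    def __post_init__(self):
        # Reject specs the conductor cannot act on, before touching any substrate.
        if self.target_orders_per_minute <= 0:
            raise ValueError("target KPI must be positive")
        if not self.applications:
            raise ValueError("at least one application must be in scope")
```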

Required subsystems

  • Deployer — CI/CD platform client, version-discovery, test-cluster deployment driver.
  • Scaler — multi-substrate scaler (Kubernetes API + AWS ECS API + ...). Captures pre-test state to revert.
  • Load generator driver — polls the traffic tool's API; runs the KPI closed-loop algorithm; pushes hatch-rate / user-count updates.
  • Cleanup — knows which downstream state the simulator creates and how to delete it.
  • API + scheduler — single declarative endpoint; also accepts a Kubernetes CronJob trigger hitting the same endpoint.
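The KPI closed loop the load-generator driver runs can be sketched as a simple proportional controller: compare the observed business KPI with the target and nudge the generator's user count accordingly. The gain, bounds, and cold-start behaviour below are illustrative assumptions, not the published algorithm:

```python
def next_user_count(current_users, observed_kpi, target_kpi,
                    gain=0.5, min_users=1, max_users=100_000):
    """One control step: move the user count toward the KPI target."""
    if observed_kpi <= 0:
        # No signal yet (ramp just started): grow cautiously instead of dividing by zero.
        return min(current_users * 2, max_users)
    # Positive error_ratio means we are below target and should add users.
    error_ratio = (target_kpi - observed_kpi) / target_kpi
    adjusted = current_users * (1 + gain * error_ratio)
    return max(min_users, min(int(adjusted), max_users))
```

Each polling cycle, the driver would feed the latest observed KPI through this step and push the resulting user count (or hatch rate) to the traffic tool's API.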

Why a microservice, not a script

  • Lifecycle state has to persist across retries. The scale-up phase of one run can outlive a single-shot Jenkins job.
  • Multi-substrate orchestration is stateful. Reverting an ECS service's desired count requires remembering what it was before the test.
  • Concurrent-run guarding. A long-running service can prevent two load tests from colliding; a script cannot.
  • API surface versus CLI surface. Developers + a CronJob + a Slackbot + CI all want the same capability; an HTTP API serves all of them uniformly.
  • Observability ownership. The conductor becomes the authoritative record of what happened on each run.
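The concurrent-run argument in particular depends on the conductor holding state a script cannot. A minimal in-process sketch of such a guard, assuming a single conductor instance; the class and error type are hypothetical:

```python
import threading

class RunGuard:
    """At most one active load test; only a long-lived process can hold this."""

    def __init__(self):
        self._lock = threading.Lock()
        self.active_run_id = None

    def acquire(self, run_id):
        # Non-blocking: a second run is rejected outright, not queued.
        if not self._lock.acquire(blocking=False):
            raise RuntimeError(f"load test {self.active_run_id} already running")
        self.active_run_id = run_id

    def release(self):
        self.active_run_id = None
        self._lock.release()
```

A real deployment with multiple conductor replicas would need the equivalent lock in shared storage, but the principle is the same: the guard lives with the service, not with the caller.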

The invocation paths (all hit the same API)

  • Manual developer trigger — curl / UI button, typically with useProductionVersions: false against a feature branch.
  • Kubernetes CronJob — for scheduled regression runs (see patterns/scheduled-cron-triggered-load-test).
  • Pre-release gate — CI step before production rollout, also via the same API.
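The CronJob path illustrates why a single HTTP endpoint pays off: the scheduled trigger is just another client. A sketch of such a manifest, where the names, image, endpoint path, and payload values are all assumptions:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: weekly-load-test
spec:
  schedule: "0 2 * * 1"        # Mondays at 02:00
  concurrencyPolicy: Forbid    # the conductor also guards against overlap
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: trigger
              image: curlimages/curl
              args:
                - "-X"
                - "POST"
                - "-d"
                - '{"targetOrdersPerMinute": 1000, "rampUpMinutes": 30, "plateauMinutes": 60, "applications": ["svc-a"], "useProductionVersions": true}'
                - "http://load-test-conductor/api/load-tests"
```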


When to apply

  • Microservices landscape of meaningful size (Zalando: 1,122 in-scope apps out of 4,000+ total). The parity + orchestration burden is too high for scripts.
  • Multiple deployment substrates that must be scaled in lockstep (Zalando: Kubernetes + AWS ECS).
  • A recurring forcing function like Cyber Week that funds the investment. See patterns/annual-peak-event-as-capability-forcing-function.

When not to

  • Monolith or small microservice footprint. Build-vs-buy favours a simpler scripted harness.
  • No pre-production environment with production parity. The conductor only adds value if the cluster it drives is close enough to prod that results transfer.
  • No business KPI that can anchor the load shape. The closed-loop ramp-up is what gives this pattern teeth; without a target KPI, simpler ramp schedules suffice.

Operational friction (honest)

  • Unrelated production deploys during a test can race with the Scaler's version snapshot. Zalando names this as unsolved in the post.
  • Infrastructure-parity work is manual — the conductor handles application layer; databases, node types, shared event buses require cross-team negotiation.
  • Evaluation is often manual — Zalando notes the pass/fail call is made by a human reading Grafana dashboards.
