CONCEPT Cited by 1 source

Declarative load-test API

Definition

A declarative load-test API is one where the client describes the desired end state of a load-test run — target load (e.g., orders-per-minute), ramp-up and plateau durations, applications in scope — and an orchestrator reconciles the system to that state. The client does not issue imperative steps like "deploy these versions", "set hatch rate to N", "scale to M replicas" one by one.

Why it matters

Imperative load-test scripts have to encode every transition required to reach a given test state — deploy, scale, ramp, plateau, cool down, scale back, clean up. This couples the client to the orchestrator's internals, produces long, fragile scripts, and makes the same test hard to rerun reliably.

Declarative APIs push those transitions into an orchestrator that owns the reconciliation. The client's contract is small and stable (the target state + a few parameters). The orchestrator can change the internal phases without breaking clients.
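The contrast can be sketched in a few lines. This is a minimal illustration with hypothetical names on both sides, not Zalando's actual API:

```python
# Imperative: the client scripts every transition itself and is coupled
# to each internal step the orchestrator exposes.
def run_load_test_imperatively(orchestrator):
    orchestrator.deploy_production_versions()
    orchestrator.scale_out(replicas=12)
    orchestrator.ramp(hatch_rate=50, minutes=10)
    orchestrator.hold(minutes=30)
    orchestrator.scale_in(replicas=3)
    orchestrator.clean_up()

# Declarative: the client hands over only the desired end state; the
# orchestrator decides which transitions are needed to reach it.
def run_load_test_declaratively(orchestrator):
    orchestrator.apply({
        "target_load": 4000,      # e.g. orders-per-minute
        "ramp_up_minutes": 10,
        "plateau_minutes": 30,
    })
```

Note that the declarative client touches one stable endpoint; every name the imperative client calls is a coupling point that can break when the orchestrator's phases change.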

The Kubernetes inspiration

Zalando's Load Test Conductor authors are explicit about the inspiration (Source: sources/2021-03-01-zalando-building-an-end-to-end-load-test-automation-system-on-top-of-kubernetes):

"Our service design was heavily influenced by what Kubernetes popularized for infrastructure management. We wanted our system to be a declarative system. Therefore, the service provides a simple API that can be used by engineers to run load tests by defining the desired state of load test. Executing a load test is now just one API call away!"

The mapping is direct: a load-test run is an object with a desired spec (target QPS, ramp, plateau, apps in scope); the orchestrator is the reconciler. Multiple invocation paths (developer API, Kubernetes CronJob) all produce the same declarative request.
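The reconciler pattern itself can be sketched in a few lines of Python. This is a toy with hypothetical field names; the real conductor's loop (including the KPI-closed-loop ramp) is internal:

```python
from dataclasses import dataclass, field

@dataclass
class LoadTestSpec:
    """Desired state: what the client declares (hypothetical fields)."""
    target_load: int        # e.g. orders-per-minute
    applications: list

@dataclass
class ObservedState:
    """Observed state: what the orchestrator measures."""
    deployed: set = field(default_factory=set)
    current_load: int = 0

def reconcile(spec: LoadTestSpec, observed: ObservedState) -> list:
    """One reconciliation step: diff desired vs. observed, emit actions.

    The orchestrator runs this kind of loop continuously; clients never
    see the individual actions, only the declared spec.
    """
    actions = []
    for app in spec.applications:
        if app not in observed.deployed:
            actions.append(("deploy", app))
    if observed.current_load < spec.target_load:
        actions.append(("raise_load", spec.target_load))
    elif observed.current_load > spec.target_load:
        actions.append(("lower_load", spec.target_load))
    return actions
```

When observed state matches the spec, the loop emits no actions — which is exactly why any invocation path that produces the same spec converges to the same run.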

The contract: what's in / what's out

In the client's declaration:

  • Target load metric. In Zalando's case, orders-per-minute (a business-level KPI, not a technical unit like RPS).
  • Ramp-up duration — how long to grow to target.
  • Plateau duration — how long to hold at target.
  • Applications in scope — an environment-specific configuration.
  • Optional: feature branch override. Disable the "deploy production versions" phase and use the developer's branch instead.

Owned by the orchestrator (not declared by the client):

  • Which versions to deploy.
  • How to scale (Kubernetes replica count vs ECS desired-count vs node pool resizing).
  • How to ramp Locust workers (the KPI-closed-loop algorithm is internal).
  • Post-test cleanup (order deletion, test-account reset).
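Taken together, the split means the whole client contract fits in a small request body. A hypothetical sketch (field names are illustrative, not Zalando's actual API):

```python
# Hypothetical declarative load-test request body. Everything the
# client declares is present; everything the orchestrator owns
# (versions, scaling mechanics, ramp algorithm, cleanup) is absent.
load_test_request = {
    "target_orders_per_minute": 4000,   # business-level KPI, not RPS
    "ramp_up_minutes": 10,              # how long to grow to target
    "plateau_minutes": 30,              # how long to hold at target
    "applications": ["checkout", "payments"],  # environment-specific scope
    # Optional: skip the "deploy production versions" phase and use a
    # developer's feature branch instead.
    "feature_branch": None,
}
```

Fields like replica counts, Locust hatch rates, or cleanup steps have no place to appear here, which is the point: the orchestrator can change how it achieves the target without the request body changing.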

What this enables

  • API-driven + scheduled invocation sharing the same contract. Both an operator and a Kubernetes CronJob hit the same API endpoint. No branching code paths.
  • Multi-substrate orchestration hidden from the client. The client doesn't know the conductor scales Kubernetes and ECS simultaneously.
  • Reusable test runs. The same declarative spec produces the same orchestration actions; reruns are idempotent (modulo residual state left behind by earlier runs).
  • Platform evolution without client churn. Changes to the conductor's internal phases — new parity checks, new cleanup steps, new substrates — don't require client updates.
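The idempotence point can be illustrated with a hypothetical plan_for function: because the orchestration plan is a pure function of the declared state, resubmitting the same spec replays the same actions:

```python
# Sketch: the plan is derived only from the declared spec, so two
# submissions of the same spec yield identical plans. All names here
# are illustrative, not the conductor's real phases.
def plan_for(spec: dict) -> list:
    return (
        [("deploy", app) for app in sorted(spec["applications"])]
        + [
            ("ramp", spec["target_load"], spec["ramp_up_minutes"]),
            ("hold", spec["plateau_minutes"]),
            ("cleanup",),
        ]
    )

spec = {
    "target_load": 4000,
    "ramp_up_minutes": 10,
    "plateau_minutes": 30,
    "applications": ["payments", "checkout"],
}

# Submitting the same spec twice produces the same plan.
assert plan_for(spec) == plan_for(spec)
```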

Trade-offs

  • Harder to debug when reconciliation fails. An imperative script's failure point is named in the script; a declarative API's failure is encoded in the orchestrator's internal state.
  • Harder to express one-off variations. "Run the same test but skip the scale-up this time" is awkward in a declarative API and trivial in an imperative script.
  • Orchestrator surface is a shared dependency. All clients are coupled to the orchestrator's supported fields.
