PATTERN Cited by 1 source

Parallel run pattern

Definition

The parallel run pattern (Sam Newman, Monolith to Microservices) is a migration technique in which, instead of switching traffic from the old implementation to the new one, both implementations are called for every request and their outputs are compared. Only one implementation, typically the old one, is the source of truth at any given time, until verified behavioural equivalence earns the new implementation trust. Once the new implementation matches the old one to an acceptable threshold, traffic is cut over.

Shape

  1. Both implementations receive the same request. Zalando's variant responds from the monolith on the hot path to keep client latency unaffected.
  2. Only one is authoritative. The old system's response is what the client sees. The new system's response is compared but discarded.
  3. Comparison along defined axes — typically HTTP status, headers, body for API-level parallel runs (see concepts/response-comparison-headers-body-status).
  4. Metrics per endpoint / per function — Matched / Unmatched / Failed counters keyed by some discriminator (Zalando uses operation_id) so readiness per slice is measurable.
  5. Readiness threshold per endpoint — each endpoint has its own target consistency percentage (see concepts/consistency-threshold-per-endpoint); once hit, that endpoint cuts over.
  6. Cutover is reversible — proxy-rule swap, not a redeploy; see patterns/gradual-per-endpoint-cutover.
  7. Cleanup — remove comparison scaffolding after cutover (Zalando: ~700 LOC + 1.3k test LOC removed).
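
A minimal in-process sketch of steps 1 to 5, not Zalando's implementation. Every name here (`Response`, `ParallelRunStats`, `parallel_run`, `ready_for_cutover`) is hypothetical. Two assumptions are not taken from the source: "Failed" is taken to mean the new implementation raised before a comparison could happen, and failed runs are excluded from the consistency percentage.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Response:
    status: int
    headers: dict
    body: str


class ParallelRunStats:
    """Matched / Unmatched / Failed counters keyed by operation_id."""

    def __init__(self):
        self.counts = defaultdict(
            lambda: {"matched": 0, "unmatched": 0, "failed": 0}
        )

    def record(self, operation_id, outcome):
        self.counts[operation_id][outcome] += 1

    def consistency(self, operation_id):
        # Assumption: failed runs do not count towards the percentage.
        c = self.counts[operation_id]
        compared = c["matched"] + c["unmatched"]
        return c["matched"] / compared if compared else 0.0


def responses_match(old, new):
    # Three comparison axes: status, headers, body.
    return (old.status, old.headers, old.body) == (new.status, new.headers, new.body)


def parallel_run(operation_id, request, old_impl, new_impl, stats):
    old_response = old_impl(request)  # authoritative: this is what the client sees
    try:
        new_response = new_impl(request)
    except Exception:
        # Assumption: an error in the new implementation counts as "failed".
        stats.record(operation_id, "failed")
    else:
        matched = responses_match(old_response, new_response)
        stats.record(operation_id, "matched" if matched else "unmatched")
    return old_response  # the new response is discarded after comparison


def ready_for_cutover(operation_id, stats, thresholds):
    # Each endpoint has its own target consistency ratio (0.0 to 1.0).
    return stats.consistency(operation_id) >= thresholds[operation_id]
```

This sketch calls the new implementation synchronously for brevity; a variant that keeps the comparison off the hot path would hand `old_response` to a background worker instead.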

When to use

  • Legacy system behaviour is under-specified. Tests don't cover the edge cases; dark corners exist that even the current maintainers can't predict.
  • High-risk migration. Financial / identity / payment / regulatory / core-business paths where getting it wrong costs real money or trust.
  • Downtime is not an option. Big-bang cutover too risky.
  • The new implementation is meant to be behaviourally equivalent. Parallel run is an equivalence-verification pattern, not a refactor-with-new-semantics pattern.
  • Idempotent requests or read-only paths. See "When NOT to use" below for the limits.

When NOT to use

  • Endpoints with side effects that can't be double-executed. POST/PATCH/DELETE that writes to a DB, publishes an event, calls a downstream mutating API — see concepts/non-idempotent-endpoint-parallel-run-constraint.
  • Low-risk migration where integration tests suffice. Parallel run's setup cost doesn't pay back if test coverage already gives enough confidence.
  • Cost-sensitive migration with no budget for ~2× load. Zalando calls the cost "potentially doubling" (see concepts/parallel-run-request-doubling).
  • The new implementation is meant to be different. If the migration introduces new behaviour (e.g., a bug fix in the new system), response comparison will report Unmatched for every affected request and the pattern can't separate signal from intended divergence.
  • Timeline pressure. As Sam Newman puts it: "Implementing a parallel run is rarely a trivial affair."
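
The side-effect constraint above can be turned into a simple gate that decides whether a request is duplicated at all. The sketch below is one conservative way to do it, not something the source describes: it duplicates only HTTP methods defined as safe (GET, HEAD, OPTIONS, TRACE) unless an operation is explicitly opted in. `eligible_for_parallel_run` and `isolated_operations` are hypothetical names; the opt-in set stands for endpoints whose side effects have been isolated by some other means.

```python
# HTTP methods defined as "safe" (read-only semantics) by the HTTP specification.
SAFE_METHODS = frozenset({"GET", "HEAD", "OPTIONS", "TRACE"})


def eligible_for_parallel_run(method, operation_id, isolated_operations=frozenset()):
    """Return True if the request may be sent to both implementations.

    Assumption: mutating operations are excluded by default and must be
    opted in explicitly once their side effects are known to be isolated.
    """
    return method.upper() in SAFE_METHODS or operation_id in isolated_operations
```

For example, `eligible_for_parallel_run("POST", "create_return")` is rejected, while the same call with `isolated_operations={"create_return"}` is accepted.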

Zalando Returns-service instance

Zalando's Returns team extracted returns logic from a monolithic application into a new Returns microservice using this pattern (Source: sources/2021-11-03-zalando-parallel-run-pattern-a-migration-technique-in-microservices):

  • Hot-path asymmetry: monolith responds to client, then POSTs to /consistency-checks on the new service which returns HTTP 202 immediately and processes the comparison asynchronously (patterns/async-consistency-checker-sidecar).
  • Three-axis response diff: HTTP status, headers, and body, with some headers ignored through endpoint-specific tuning.
  • Per-operation_id metrics: Matched / Unmatched / Failed emitted to systems/prometheus, visualised in systems/grafana.
  • Per-endpoint thresholds: each endpoint had its own target consistency percentage.
  • Cutover via Skipper: one endpoint at a time, rule swap only, no redeploy needed for rollback.
  • Cleanup: ~700 lines of production code + ~1.3k lines of tests removed post-migration.
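
The hot-path asymmetry can be sketched in-process with a queue and a background worker. This is an illustration under assumptions, not Zalando's code: `ConsistencyChecker`, `submit`, and `drain` are hypothetical names, `submit` stands in for the POST to /consistency-checks, and the callables passed to the constructor stand in for the call into the new service and the comparator.

```python
import queue
import threading

ACCEPTED = 202  # HTTP status returned before any comparison work is done


class ConsistencyChecker:
    def __init__(self, call_new_service, compare):
        self._call_new_service = call_new_service
        self._compare = compare
        self._jobs = queue.Queue()
        self.results = []  # (operation_id, outcome) pairs, in processing order
        threading.Thread(target=self._worker, daemon=True).start()

    def submit(self, operation_id, request, old_response):
        """Enqueue the comparison and return immediately, like an HTTP 202."""
        self._jobs.put((operation_id, request, old_response))
        return ACCEPTED

    def _worker(self):
        # Runs off the hot path: the client already has the monolith's response.
        while True:
            operation_id, request, old_response = self._jobs.get()
            try:
                new_response = self._call_new_service(request)
                matched = self._compare(old_response, new_response)
                outcome = "matched" if matched else "unmatched"
            except Exception:
                # Assumption: any error while checking counts as "failed".
                outcome = "failed"
            self.results.append((operation_id, outcome))
            self._jobs.task_done()

    def drain(self):
        """Block until every queued comparison has been processed."""
        self._jobs.join()
```

In a real deployment the outcomes would feed the per-operation_id counters rather than an in-memory list; `drain` exists only so a test or script can wait for the worker.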

Failure modes

  • Treating global match-rate as the readiness signal. Zalando's instance sets thresholds per operation_id because "fixing those last few percentages has a cost higher than the value it brings" for a given endpoint, and different endpoints tolerate different levels of mismatch.
  • Not accounting for request-doubling cost. Allocating capacity for the new service only, without provisioning the async comparison path + the added load on downstream dependencies, hits a capacity wall mid-migration.
  • Forgetting to tune the comparator. Date and X-Request-Id headers that are not ignored, different default headers from a new HTTP framework, PDF metadata, and unstable collection ordering all produce spurious Unmatched noise.
  • Running parallel run on non-idempotent endpoints without isolation. This double-sends a webhook, double-publishes an event, or double-updates a database row. Symptoms range from log noise to duplicate charges.
  • GDPR leakage through comparison storage. Personal data in request/response bodies flows through the comparator; if stored for offline inconsistency investigation, it's a regulatory exposure. Zalando names this but doesn't mechanise the mitigation.
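
Comparator tuning, as described in the failure modes above, mostly means normalising both responses before diffing them. The sketch below assumes JSON bodies and responses represented as plain dicts; `IGNORED_HEADERS`, `UNORDERED_KEYS`, and the `items` field are hypothetical examples, and the real ignore list would be tuned per endpoint.

```python
import json

# Assumption: headers expected to differ between implementations (lowercase).
IGNORED_HEADERS = {"date", "x-request-id"}
# Hypothetical: JSON fields whose list order carries no meaning.
UNORDERED_KEYS = {"items"}


def normalise_headers(headers, ignored=IGNORED_HEADERS):
    # HTTP header names are case-insensitive, so compare them lowercased.
    return {k.lower(): v for k, v in headers.items() if k.lower() not in ignored}


def normalise_body(value, unordered_keys=UNORDERED_KEYS):
    if isinstance(value, dict):
        out = {}
        for key, item in value.items():
            item = normalise_body(item, unordered_keys)
            if key in unordered_keys and isinstance(item, list):
                # Sort by canonical JSON so element order stops mattering.
                item = sorted(item, key=lambda x: json.dumps(x, sort_keys=True))
            out[key] = item
        return out
    if isinstance(value, list):
        return [normalise_body(item, unordered_keys) for item in value]
    return value


def equivalent(old, new):
    """Compare status, tuned headers, and normalised JSON body."""
    return (
        old["status"] == new["status"]
        and normalise_headers(old["headers"]) == normalise_headers(new["headers"])
        and normalise_body(json.loads(old["body"]))
        == normalise_body(json.loads(new["body"]))
    )
```

Non-JSON payloads such as PDFs need their own normalisation step (for example, stripping metadata before comparison), which this sketch does not cover.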

Contrast with neighbours

  • patterns/shadow-migration — runs the new engine in parallel but typically at dataset / pipeline altitude with reconciliation being statistical-equivalence-style. Parallel run at API altitude shares the dual-invocation mechanic but compares response-level outputs on every request.
  • patterns/feature-flagged-dual-implementation — application-code analogue; both code paths exist in the same process, gated by a flag that routes traffic between them. No output comparison. Cheaper but doesn't verify equivalence.
  • patterns/expand-migrate-contract — the schema-migration analogue at storage layer. Expand (both schemas work), migrate (traffic flips), contract (old schema removed). Parallel run is the behavioural analogue.
  • patterns/canary-and-shadow-cluster-rollout — shadow traffic to the new cluster without comparing responses. Catches crashes on realistic traffic but not semantic divergence.

Seen in
