PATTERN

Gradual per-endpoint cutover

Definition

Gradual per-endpoint cutover is the rollout discipline of migrating traffic to a new implementation one endpoint (operation_id) at a time via a proxy rule change, rather than flipping a single global switch. Rollback is a proxy-config revert — no redeploy of either application.

Shape

Proxy fronts both implementations. All traffic enters through a reverse proxy (Zalando uses Skipper) that maps operation_id → backend.
Each endpoint's readiness is assessed independently. See concepts/consistency-threshold-per-endpoint — threshold can vary per endpoint.
Cutover is a proxy rule change. Edit the route for that operation_id to point at the new backend. No deploy.
Rollback is the inverse rule change. Flip the same route back to the old backend.
Other endpoints continue on the old backend while the migrated ones ride the new one. Cutover is serialised per endpoint, not parallel.
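The shape above can be sketched as a small in-memory route table. This is only an illustration of the operation_id → backend mapping, not Skipper's actual configuration format or API; the class name, the two backend hosts, and the operation ids `get_return` and `create_return` are hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical backend hosts, chosen only for illustration.
OLD_BACKEND = "https://legacy.example.internal"
NEW_BACKEND = "https://returns.example.internal"


@dataclass
class RouteTable:
    """In-memory stand-in for the proxy's operation_id -> backend rules."""

    routes: dict = field(default_factory=dict)

    def backend_for(self, operation_id: str) -> str:
        # Endpoints without an explicit rule stay on the old backend.
        return self.routes.get(operation_id, OLD_BACKEND)

    def cut_over(self, operation_id: str) -> None:
        # Cutover: one rule change for one endpoint, no deploy.
        self.routes[operation_id] = NEW_BACKEND

    def roll_back(self, operation_id: str) -> None:
        # Rollback: the inverse rule change for the same endpoint.
        self.routes[operation_id] = OLD_BACKEND


table = RouteTable()
table.cut_over("get_return")  # only this endpoint moves to the new backend
```

After the single `cut_over` call, `get_return` resolves to the new backend while `create_return` still resolves to the old one; calling `roll_back("get_return")` restores the original routing without touching any other endpoint.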

Why not cut over all endpoints at once

Per-endpoint consistency thresholds differ. Some endpoints reach the readiness bar earlier than others; gating the whole cutover on the slowest endpoint wastes time.
Blast radius minimisation. A regression on one endpoint doesn't impact others.
Feedback compounds. Each cutover reveals production issues (load profile, dependency behaviour, downstream retry storms) that inform the next cutover.
Development can continue on not-yet-ready endpoints in parallel with early cutovers rather than everything blocking on all endpoints being ready.

Zalando Returns-service instance

"The switch was done gradually, and it was done per endpoint to allow the system to be tested in a fully functional way. This was achieved by using a proxy to move the forwarding of the requests to the Returns microservice one by one once they were ready. In our case we used Skipper, an open-source Proxy developed by Zalando."

"By using a proxy to do the traffic switch, rolling back just requires a change to the proxy to migrate the endpoint back to use the previous host instead of the microservice one; this avoids the need of redeploying, making the whole process faster." (Source: )

Each switch moved exactly one endpoint, the smallest possible unit of rollout, so "we avoided introducing a massive set of changes in one go."

When to use

Migration at HTTP-API altitude where endpoint-level traffic shaping is cheap (proxy already in the path).
Per-endpoint behaviour varies enough that a single readiness signal doesn't apply.
Rollback must be sub-minute and not require a deploy.
Pairs naturally with patterns/parallel-run-pattern — the parallel run gives per-endpoint readiness signals; this pattern acts on them.
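One way to act on per-endpoint readiness signals is to compare each endpoint's match rate from the parallel run against its own threshold and pick a single endpoint to cut over next. The sketch below assumes match rates and thresholds are available as plain dictionaries keyed by operation_id; the function name, the example numbers, and the "largest margin first" ordering are illustrative assumptions, not something the source prescribes.

```python
from typing import Optional


def next_cutover(match_rates: dict, thresholds: dict, migrated: set) -> Optional[str]:
    """Return one not-yet-migrated endpoint whose match rate meets its threshold.

    Among ready endpoints, the one with the largest margin over its own
    threshold is chosen (an assumed tie-breaking rule). Returns None when
    no endpoint is ready.
    """
    best = None
    best_margin = 0.0
    for operation_id, rate in match_rates.items():
        # Skip endpoints already cut over or without a defined threshold.
        if operation_id in migrated or operation_id not in thresholds:
            continue
        margin = rate - thresholds[operation_id]
        if margin >= 0 and (best is None or margin > best_margin):
            best, best_margin = operation_id, margin
    return best


# Hypothetical readiness data from a parallel run.
match_rates = {"get_return": 0.9995, "create_return": 0.97, "list_returns": 1.0}
thresholds = {"get_return": 0.999, "create_return": 0.9999, "list_returns": 0.99}

candidate = next_cutover(match_rates, thresholds, migrated=set())
```

Returning a single candidate rather than the full ready list keeps the cutover serialised: one endpoint moves, its production behaviour is observed, and only then is the function called again.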

When NOT to use

Atomic cross-endpoint contract. If two endpoints must flip together (e.g., the migrated service relies on the new-backend version of another of its own endpoints), per-endpoint cutover breaks that invariant.
No proxy in the path. Introducing a proxy just for the migration adds complexity; a feature-flag or DNS-based cutover may be simpler.
Migration is not behavioural-equivalence-first. If the new service is intentionally meant to behave differently from the old one on some endpoints, match rates against the old implementation can't gate the cutover.

Failure modes

Skipping the readiness threshold and doing one-big-batch cutover. Loses the per-endpoint diagnostic value.
Forgetting endpoints in the proxy rule set during cleanup — leaves shadow routes accumulating.
Dual ownership ambiguity during the long tail. Teams get fatigued maintaining the legacy service for the last 5% of endpoints that keep hitting Unmatched on comparator edge cases.
Proxy rule change without audit trail. Cutover moves fast without deploys, which is the feature — but also means no git commit to roll back to if the proxy config source of truth isn't version-controlled.
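The audit-trail failure mode can be mitigated by recording every rule change alongside the backend it replaced, so a rollback always has a known target. The following is a minimal sketch of that idea using an in-memory log; in practice the same role would typically be played by a version-controlled proxy config. All names here (`RouteChange`, `AuditedRoutes`, the author strings) are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class RouteChange:
    """One recorded proxy rule change for a single endpoint."""

    operation_id: str
    previous_backend: str
    new_backend: str
    author: str
    changed_at: datetime


class AuditedRoutes:
    """Route table that appends every change to an in-memory audit log."""

    def __init__(self, initial: dict):
        self.routes = dict(initial)
        self.log = []

    def set_backend(self, operation_id: str, backend: str, author: str) -> RouteChange:
        # Raises KeyError for unknown endpoints, so typos cannot create new routes.
        change = RouteChange(
            operation_id=operation_id,
            previous_backend=self.routes[operation_id],
            new_backend=backend,
            author=author,
            changed_at=datetime.now(timezone.utc),
        )
        self.routes[operation_id] = backend
        self.log.append(change)
        return change

    def revert_last(self, operation_id: str, author: str) -> RouteChange:
        # Undo the most recent change for this endpoint; the revert is itself logged.
        for change in reversed(self.log):
            if change.operation_id == operation_id:
                return self.set_backend(operation_id, change.previous_backend, author)
        raise LookupError(f"no recorded change for {operation_id}")
```

Because the revert is logged like any other change, the log stays a complete history of which backend served each endpoint and who changed it.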

Seen in

Zalando Returns extraction (2021-11-03, ) — Skipper rule change per operation_id as the only cutover mechanism.