PATTERN

Controlled rollout with traffic ramp-up¶

Problem¶

A new variant ships to 100% of users on day one. If the variant has a bug, everyone sees it. If multiple teams are coordinating a launch, they can't all flip 100% at the same instant without synchronisation. Rolling back is disruptive; previewing impact on a small cohort before full launch is the whole point of A/B testing but lives outside the experiment infrastructure.

Solution¶

Expose feature toggles + traffic ramp-up as first-class primitives on the experimentation platform. A team defines the variant, picks an initial exposure percentage, and the platform:

Assigns the chosen fraction of users to the treatment variant via the existing randomization engine.
Collects the same A/B metrics as a full test would.
Lets the team ramp up (or ramp down) exposure on a schedule or on-demand, based on observed metrics.
Lets multiple teams coordinate ramp schedules so a complex cross-team feature goes live at the same effective moment.

The pattern treats rollout control as an experiment whose sample size grows over time, not as a separate deploy process.

Zalando's applied case¶

Octopus (see systems/octopus-zalando-experimentation-platform) added traffic-ramp-up and feature-toggle primitives during its Walk phase to support two customer use cases (Source: ):

Some teams want to gradually increase traffic into the test so they don't accidentally show a buggy variant to many users.
Other teams are working on a complex cross-team feature release and want to release at the same time — the platform provides shared ramp-up scheduling.

Zalando cites Fowler's canonical "Feature Toggles" article as the framing.

Why the platform, not the application¶

Application-layer feature toggles (env var, config file, code branch) work for a small team's one-off launch, but they don't:

Share a randomization basis with running A/B tests (so toggle-exposed users drift from experiment cohorts).
Provide per-percentile observability.
Give leadership visibility into what's rolling out.
Coordinate across teams without ad-hoc Slack threads.

Platform-level toggles sit on top of the A/B engine, inherit all of its telemetry, and integrate with result analysis.

Relationship to other rollout patterns¶

patterns/staged-rollout / patterns/release-train-rollout-with-canary — infrastructure-level progressive rollouts for code artefacts. Complementary: code ships through stages; the feature exposed by that code ramps through cohorts.
patterns/ab-test-rollout (Atlassian) — percentile-guardrail rollout via A/B. Same pattern class: uses the experiment infrastructure to gate rollout on metric movement.
patterns/cohort-percentage-rollout — the plain cohort-% variant without the A/B-backed randomization.

When not to use¶

Security / correctness fixes — the need is to get the fix to 100% as fast as safe, not to measure user-behaviour tradeoffs. Ramp-up is about user-facing variants whose impact is uncertain; security fixes are not that.
Irreversibly persisted state — if the variant writes data under a new schema to the same tables as control, ramping and then reverting is not clean. See patterns/expand-migrate-contract.

Seen in¶

— Octopus adds traffic ramp-up + feature toggles during Walk phase for bug-safety + multi-team-coordination use cases