
PATTERN

Weighted-sum strategy migration

Weighted-sum strategy migration: when gradually migrating between two algorithms that produce the same shape of numeric output (e.g. two load-balancing strategies producing endpoint weights), blend their outputs via a percentage feature flag rather than flag-gating which algorithm's output the client uses. Every client sees the same blended output at any instant, regardless of feature-gate bucketing.

Problem

The naïve rollout: flip a feature flag per client, bucket clients into "old-strategy" and "new-strategy" groups. At 30% rollout, 30% of clients route by the new weights and 70% by the old. Works fine for most A/B tests.

For load balancing, it doesn't: different clients would route the same request class to different backends, potentially creating routing inconsistency:

  • A sticky-routing contract (session affinity, consistent hashing for shard correctness) can break across the bucketed cohorts.
  • Cache-warming assumptions can get invalidated on the 70% side when the 30% side moves traffic.
  • Metrics attribution (was this outage because of the new strategy? or because its 30% of traffic coincidentally hit something else?) becomes tangled.

Pattern

Have each strategy's control plane write its outputs into separate entries in a shared store:

routing-db/{service}/strategy-A/endpoint-1 → weight
routing-db/{service}/strategy-A/endpoint-2 → weight
...
routing-db/{service}/strategy-B/endpoint-1 → weight
routing-db/{service}/strategy-B/endpoint-2 → weight

Every client reads both sets on every update, plus a percentage α (a shared feature flag), and computes:

effective_weight[endpoint] = w_A[endpoint] × α + w_B[endpoint] × (1 − α)

α = 0 → pure old (strategy B); α = 1 → pure new (strategy A). All clients see the same α and the same weights; there is no per-client bucketing.
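A minimal sketch of the client-side blend, assuming each strategy's weight set arrives as a per-endpoint dict read from the routing DB. The function name and dict-based representation are illustrative, not from the source:

```python
def blended_weights(w_new: dict[str, float],
                    w_old: dict[str, float],
                    alpha: float) -> dict[str, float]:
    """alpha = 0 -> pure old strategy; alpha = 1 -> pure new strategy."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    # Union of endpoints; a missing entry counts as weight 0, so the two
    # strategies need not list exactly the same endpoints.
    endpoints = w_new.keys() | w_old.keys()
    return {ep: w_new.get(ep, 0.0) * alpha + w_old.get(ep, 0.0) * (1.0 - alpha)
            for ep in endpoints}

# Worked example matching the source: new strategy weights the endpoint 100,
# old strategy weights it 200, 30% rollout of the new strategy:
w = blended_weights({"endpoint-A": 100.0}, {"endpoint-A": 200.0}, 0.3)
# endpoint-A blends to 170 (100 × 0.3 + 200 × 0.7)
```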

The migration is driven by changing α over time:

  • Start at α = 0 for weeks while the new strategy computes weights (so you can compare).
  • Ramp α to 5%, 10%, 25%, 50%, 100% on the operator's schedule.
  • Roll back instantly by setting α back to 0.
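The ramp itself reduces to writing successive α values to the shared flag that every client reads. A hypothetical sketch (the flag store, schedule values, and function names are assumptions, not from the source):

```python
# Hypothetical migration driver. The shared flag store is modeled as a dict;
# in practice this would be the feature-flag service all clients poll.
RAMP_SCHEDULE = [0.0, 0.05, 0.10, 0.25, 0.50, 1.0]

flag_store = {"alpha": 0.0}  # alpha = 0 -> all traffic routed by the old strategy

def set_alpha(alpha: float) -> None:
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    flag_store["alpha"] = alpha

def ramp_to(step: int) -> None:
    """Advance the migration to the given step of the schedule."""
    set_alpha(RAMP_SCHEDULE[step])

def rollback() -> None:
    """Instant rollback: one flag write, no client redeploys."""
    set_alpha(0.0)
```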

What this does and doesn't give you

Does:

  • Consistent routing across the fleet at every α. No two clients disagree about which backend sees what fraction of traffic.
  • Instant rollback. Single flag flip, no client redeploys.
  • Observability of "halfway" states. Each strategy's weights are separately visible in the routing DB, so you can see how each would route without actually committing.
  • Parallel correctness proving. Run both for weeks at α = 0, compare outputs offline, only then start ramping.

Doesn't:

  • Give you a true A/B in the "measure strategy A's behavior vs strategy B's behavior in isolation" sense — the blend is always mixed. That's the tradeoff for consistency.
  • Work when the strategies produce qualitatively different output shapes (e.g. one produces weights, another produces shard-routing decisions). Both must share a numeric surface amenable to linear combination.
  • Solve the underlying "is the new strategy safe at 100%?" question — it just gives a smooth rollout path.

When to use it

  • Migrating between LB algorithms that both produce per-endpoint weights.
  • Migrating between ranking / scoring models whose outputs compose linearly.
  • Any strategy migration where routing consistency across clients matters more than isolation per cohort.

When not to use it

  • A/B experiments where you want isolated measurement of each arm's effect. Use patterns/ab-test-rollout instead.
  • Qualitatively different strategies that don't share a numeric output space.
  • Single-decision-per-request migrations (e.g. which database to query) where blending makes no sense.

Seen in

  • sources/2024-10-28-dropbox-robinhood-in-house-load-balancing: Robinhood's migration between round-robin and PID-based load balancing. Both strategies' LBS instances write their own endpoint weights into distinct routing-DB entries; clients blend with a percentage gate. Example from the post: endpoint A weighted 100 under PID and 200 under round-robin; at a 30% PID feature-gate the client sees 100 × 0.3 + 200 × 0.7 = 170. Stated value: "every client sees the same weight assignment for endpoints while gradually migrating to the new load balancing strategy."