PATTERN Cited by 1 source
Feature-gate pre-migration network rewrite¶
Intent¶
When migrating workloads would break users who have hardcoded literal addresses in their configuration, first ship an in-guest compatibility feature that transparently maps the old address to the new one, so that migration can proceed without customer-visible downtime. Then rewrite the configs at operator pace.
When to use¶
- Your platform's addressing scheme requires address changes on migration (classic case: patterns/embedded-routing-header-as-address / 6PN).
- Some fraction of users (possibly including your own teams, as Fly.io discovered) have baked literal addresses into their configs despite DNS being the intended contract.
- You can push a guest-side feature update (init binary, agent, sidecar) before the migration.
- Outright ban-then-enforce would cause unacceptable customer pain.
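To gauge the second condition, it can help to scan config text for address literals before committing to the pattern. The following is a minimal sketch, not Fly.io's tooling: the function name `find_literal_addresses` and the default prefix `fd00::/8` are assumptions chosen for illustration, and the prefix should be replaced with whatever range your platform actually assigns.

```python
import ipaddress
import re

# Loose candidate matcher: runs of hex digits and colons. Real validation is
# delegated to the ipaddress module below.
_CANDIDATE = re.compile(r"[0-9A-Fa-f:]{2,}")


def find_literal_addresses(config_text, private_prefix="fd00::/8"):
    """Return (line_number, address) pairs for IPv6 literals in private_prefix.

    `private_prefix` is a hypothetical default, not a documented platform value.
    """
    network = ipaddress.ip_network(private_prefix)
    hits = []
    for lineno, line in enumerate(config_text.splitlines(), start=1):
        for token in _CANDIDATE.findall(line):
            if token.count(":") < 2:
                continue  # plain hex words or port fragments, not an address
            try:
                addr = ipaddress.IPv6Address(token)
            except ValueError:
                continue  # looked like an address but did not parse
            if addr in network:
                hits.append((lineno, str(addr)))
    return hits
```

Running this over every config file in the fleet gives a rough count of affected workloads. It only sees literals that appear as plain text, so addresses assembled at runtime or stored in databases would need a different check.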
Structure¶
Phase 1: Ship the bridge
- Extend the guest-side system (init, agent, sidecar) with an address-mapping feature that maintains a table of old_address → new_address and intercepts traffic addressed to the old one.
- Validate that the mapping works in production on workloads that haven't yet been migrated.
Phase 2: Migrate
- Migration now changes the address safely: traffic to the old address still works because the bridge redirects it.
Phase 3: Rewrite configs
- Sweep the fleet for literal-address uses.
- Update configs to use DNS or the new address.
- This is the "burned several weeks" step in Fly.io's telling: labor-intensive but not time-critical.
Phase 4: Retire the bridge (eventually)
- Once the config sweep is complete, the bridge can be removed.
- This step may never complete for legacy workloads; that's a platform-hygiene debt to track, not an urgent problem.
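The core of the Phase 1 bridge is the old_address → new_address table. The sketch below models only that lookup. The source does not describe how Fly.io's init implements its mapping, so the class name `AddressBridge` and its methods are hypothetical, and actual interception would happen at the guest's network layer, which this example does not attempt to model.

```python
import ipaddress


class AddressBridge:
    """Hypothetical in-guest table of old_address -> new_address."""

    def __init__(self):
        self._table = {}

    def add_mapping(self, old, new):
        # Parse both sides so different spellings of one address share a key.
        self._table[ipaddress.IPv6Address(old)] = ipaddress.IPv6Address(new)

    def remove_mapping(self, old):
        # Phase 4: dropping an entry retires the bridge for that address.
        self._table.pop(ipaddress.IPv6Address(old), None)

    def resolve(self, destination):
        """Return the address that traffic to `destination` should reach."""
        dest = ipaddress.IPv6Address(destination)
        # Unmapped destinations pass through unchanged.
        return self._table.get(dest, dest)


bridge = AddressBridge()
bridge.add_mapping("fdaa::1", "fdaa::2")  # example addresses, not real ones
print(bridge.resolve("fdaa:0:0:0:0:0:0:1"))  # same address as fdaa::1
```

Keying the table on parsed addresses rather than strings matters here, because hardcoded literals in configs may spell the same address in expanded or compressed form.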
Canonical example¶
From Fly.io 2024-07-30, on the 6PN address changes that migration requires:
The obvious fix for this is not complicated; given flyctl ssh access to a Fly Postgres cluster, it's like a 30 second ninja edit. But we run a lot of Fly Postgres clusters, and the change has to be coordinated carefully to avoid getting the cluster into a confused state. We went as far as adding feature to our init to do network address mappings to keep old 6PN addresses reachable before biting the bullet and burning several weeks doing the direct configuration fix fleet-wide.
The "network address mappings" feature in Fly init is the Phase-1 bridge. The "several weeks doing the direct configuration fix" is Phase 3. Phase 4 is presumably ongoing.
Consequences¶
Upsides:
- Migration stops being gated on cluster-config updates. The platform can proceed at its own pace.
- No customer-visible downtime during the config rewrites.
- Discovers unknown literal-address users: when the bridge intercepts traffic for an address that should no longer be in use, you've found another config to update.
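The discovery upside depends on the bridge recording what it intercepts. A minimal sketch of that bookkeeping follows, assuming the bridge can report a source identifier and destination address for each intercepted connection. `BridgeHitLog` and its methods are hypothetical names, and addresses are treated as already-normalized strings.

```python
from collections import Counter


class BridgeHitLog:
    """Hypothetical tally of traffic still aimed at mapped (old) addresses."""

    def __init__(self, mapped_old_addresses):
        self._mapped = set(mapped_old_addresses)
        self._hits = Counter()

    def record(self, source, destination):
        # Only traffic to an old, mapped address indicates a stale config.
        if destination in self._mapped:
            self._hits[(source, destination)] += 1

    def remaining_users(self):
        """Return (source, old_address, hit_count) tuples, busiest first."""
        return [(src, dst, n) for (src, dst), n in self._hits.most_common()]
```

Each entry in `remaining_users()` points at a workload whose config still holds a literal address, which turns the Phase 3 sweep into a list that shrinks as configs are fixed.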
Downsides:
- The bridge is meant to be temporary, but in practice it may outlive its intended purpose.
- Guest-side code surface has grown — init or agent has become responsible for a compatibility shim.
- Doesn't prevent the antipattern; it just mitigates the damage. The literal addresses are still there in configs; the next architectural change that needs an address change hits the same problem unless the config rewrite (Phase 3) is actually finished and the bridge retired (Phase 4).
Seen in¶
- sources/2024-07-30-flyio-making-machines-move — Fly.io's handling of literal 6PN addresses in Fly Postgres cluster configs.
Related¶
- systems/fly-init — The guest-side bridge carrier.
- concepts/hardcoded-literal-address-antipattern — The underlying failure mode.
- concepts/embedded-routing-in-ip-address — The address scheme that forced the issue.
- patterns/embedded-routing-header-as-address — The general pattern.