CONCEPT Cited by 1 source
Zero-code-change migration¶
Zero-code-change migration is the organisational and engineering commitment that, when a platform team moves user workloads from one execution substrate to another, no user code is allowed to change — training scripts, preprocessing logic, inference code, service handlers, config, and invocation shape all keep working identically on the new substrate. The burden of compatibility lives entirely in the platform.
Why take the constraint¶
The alternative — requiring every user team to port their workloads — fails at scale for two reasons:
- Organisational cost. Hundreds of engineers across dozens of teams rewriting business-critical workflows is untenable. Productivity loss over the migration window dwarfs any cloud bill savings.
- Risk. User-code changes during an infra migration conflate two sources of breakage (infra + user logic), making regressions hard to localise.
Taking the zero-code-change constraint up front re-frames a cloud migration from "move the workloads" to "reproduce the old environment on the new substrate with bit-for-bit behavioural equivalence". Lyft's ML platform team on the LyftLearn 2.0 migration was explicit about this:
"Forcing hundreds of users across dozens of teams to rewrite their business-critical ML workflows was not an option. The cost of such a disruption in terms of lost productivity and engineering effort would have made the migration untenable, which meant the burden of compatibility was entirely on our platform." (Source: sources/2025-11-18-lyft-lyftlearn-evolution-rethinking-ml-platform-architecture)
What the platform has to absorb¶
Adopting the constraint pushes all migration complexity into the platform. On LyftLearn 2.0 specifically:
- concepts/environmental-parity becomes the primary metric — the new environment must be indistinguishable from the old from user code's viewpoint.
- Container-entrypoint compatibility layers re-synthesise missing platform primitives (credentials, env vars, metrics transport, hyperparameter passing) at startup.
- Cross-platform base images ship one image that works on multiple execution substrates by detecting its runtime context.
- Provider-side negotiations (AWS made Studio Domain networking changes in Lyft's account) close cross-stack networking gaps that customer-side config couldn't.
See patterns/zero-code-change-platform-migration for the pattern in implementation form.
Trade-off¶
The platform team accepts substantially more complexity in exchange for organisational tractability. The migration becomes longer and harder at the platform layer, but nearly invisible to users — which is what makes large migrations ever finish.
Seen in¶
- sources/2025-11-18-lyft-lyftlearn-evolution-rethinking-ml-platform-architecture — the canonical wiki instance; Kubernetes → SageMaker migration for compute with no user-code changes.