CONCEPT Cited by 1 source
Portable execution environment¶
A portable execution environment is a uniquely named, fully declared package + binary environment that can be built once and rehydrated anywhere — including at execution time, not just deploy time. It is the Metaflow alternative to asking data scientists to hand-manage Docker images.
Why not just Docker¶
Open-source Metaflow originally relied only on @conda. Netflix's motivation for going further:
"[W]e need to help the developer to package and rehydrate the whole execution environment of a project in a remote pod in a reproducible manner (preferably quickly). Specifically, we don't want to ask developers to manage Docker images of their own manually, which quickly results in more problems than it solves. This is why Metaflow provides support for dependency management out of the box." (Source: sources/2024-07-22-netflix-supporting-diverse-ml-systems-at-netflix)
What "portable" buys you¶
The Netflix metaflow-nflx-extensions package added two capabilities on top of open-source @conda / @pypi (the latter was later upstreamed):
- Unique-named environments: the metaflow environment CLI builds an env and assigns it a unique name tied to the run ID + model type, so other flows / steps can reference that exact environment later.
- Execution-time env fetch: unlike typical flows where @conda / @pypi envs are resolved at deploy time, portable envs can be resolved at execution time. This is what makes patterns/dynamic-environment-composition possible: a higher-order flow can build an env that includes another flow's deps + its own, and use it in a subsequent step.
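The two capabilities above can be illustrated with a minimal, standard-library-only sketch. It does not use Metaflow or metaflow-nflx-extensions; EnvSpec, make_spec, env_name, EnvRegistry and run_step are hypothetical names invented for illustration, and the "model type / run ID" naming format is an assumption rather than the format the real CLI produces. The point is only that a step can carry an environment name and look up the fully declared spec when it runs.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvSpec:
    """Hypothetical fully declared environment: interpreter plus pinned packages."""
    python: str
    packages: tuple  # sorted (name, version) pairs

    def fingerprint(self) -> str:
        # Stable digest of the declaration, so identical specs compare equal anywhere.
        payload = json.dumps([self.python, self.packages])
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]


def make_spec(python: str, packages: dict) -> EnvSpec:
    # Sorting makes the spec independent of the order the pins were written in.
    return EnvSpec(python=python, packages=tuple(sorted(packages.items())))


def env_name(run_id: str, model_type: str) -> str:
    # Assumed naming scheme: the run ID gives uniqueness, the model type gives context.
    return f"{model_type}/{run_id}"


class EnvRegistry:
    """In-memory stand-in for wherever built environments are stored."""

    def __init__(self) -> None:
        self._envs = {}

    def publish(self, name: str, spec: EnvSpec) -> None:
        existing = self._envs.get(name)
        if existing is not None and existing != spec:
            raise ValueError(f"environment name already taken: {name}")
        self._envs[name] = spec

    def fetch(self, name: str) -> EnvSpec:
        return self._envs[name]


def run_step(registry: EnvRegistry, name: str) -> str:
    # The lookup happens here, when the step executes, not when the flow is deployed.
    spec = registry.fetch(name)
    return f"{name}@{spec.fingerprint()}"


if __name__ == "__main__":
    registry = EnvRegistry()
    name = env_name("1234", "model_a")
    # Package pins are illustrative only.
    registry.publish(name, make_spec("3.10", {"xgboost": "2.0.3", "numpy": "1.26.4"}))
    print(run_step(registry, name))
```

Because run_step receives only a name, the environment can be published after the step has been defined, which is the property that separates execution-time resolution from deploy-time resolution in this sketch.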
Canonical use case — the "Explainer flow"¶
Netflix trains many models, each with its own deps. To train an explainer model per trained model, the training system needs:
- Access to the original model and its training environment.
- Dependencies specific to building the explainer model itself.
The Explainer flow is event-triggered by upstream Model A/B/C flows. Its build_environment step calls metaflow environment to build an env that includes both the input-model deps and the explainer-specific deps, assigning a unique name. Its train_explainer step then fetches that exact env at execution time and runs in it.
This pattern, a higher-order training system that takes another system's env as input and produces an env-aware artifact, is the generalisable lesson.
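The two-step shape of the Explainer flow can be sketched in plain Python. This is an illustration under stated assumptions, not Netflix's implementation: it does not call Metaflow or the metaflow environment CLI, the ENVIRONMENTS dict is a hypothetical stand-in for wherever built environments are stored, the "explainer/model type/run ID" name format is invented, and the package pins are illustrative. Failing on conflicting pins is also a simplification; a real resolver would try to find a compatible version.

```python
# Hypothetical in-memory store of built environments, keyed by unique name.
ENVIRONMENTS = {}


def compose(model_deps: dict, explainer_deps: dict) -> dict:
    """Merge two pinned package sets, refusing to guess when pins disagree."""
    merged = dict(model_deps)
    for package, version in explainer_deps.items():
        if package in merged and merged[package] != version:
            raise ValueError(
                f"conflicting pins for {package}: {merged[package]} vs {version}"
            )
        merged[package] = version
    return merged


def build_environment(run_id: str, model_type: str,
                      model_deps: dict, explainer_deps: dict) -> str:
    """First step: build the combined env and return its unique name."""
    name = f"explainer/{model_type}/{run_id}"  # assumed naming scheme
    ENVIRONMENTS[name] = compose(model_deps, explainer_deps)
    return name


def train_explainer(name: str) -> list:
    """Second step: fetch the env by name at execution time and 'run' in it."""
    packages = ENVIRONMENTS[name]
    # Placeholder for real training: report which packages would be available.
    return sorted(packages)


if __name__ == "__main__":
    env = build_environment(
        run_id="1234",
        model_type="model_a",
        model_deps={"xgboost": "2.0.3", "numpy": "1.26.4"},
        explainer_deps={"shap": "0.45.0", "numpy": "1.26.4"},
    )
    print(env, train_explainer(env))
```

Only the environment name passes between the two steps, so the second step does not need to know the input model's dependencies when the flow is defined.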
Stability caveat¶
Portable environments are implemented via Metaflow's extension mechanism, which is "publicly available but subject to change, and hence not a part of Metaflow's stable API yet." Use at your own risk.