CONCEPT Cited by 1 source

Pipeline environment

Definition

A pipeline environment is a named version of a batch pipeline — a complete set of orchestrator workflow definitions (e.g. Airflow DAGs) deployed to a single orchestrator server such that the version can be scheduled and run end-to-end independently from other versions on the same server.

Defined by Zalando in sources/2022-06-09-zalando-accelerate-testing-in-apache-airflow-through-dag-versioning:

"A pipeline environment is a version of a pipeline (set of Airflow DAGs) deployed to an Airflow server on which it can run end-to-end. Each environment contains all DAGs necessary to produce the required output (e.g. marketing ROI in our case), so multiple environments can co-exist on one server and can be used independently."

Why the abstraction matters

Airflow, like most orchestrators, has no native concept of an environment. A DAG id is unique per Airflow server, so a given DAG exists in exactly one version at a time. If multiple teams want to test conflicting changes to the same DAG, they either:

  • share the test server and collide, or
  • use separate servers (expensive and slow; see MWAA, ~30 min per server).

"Pipeline environment" is the layer Zalando adds on top of Airflow to give isolation without multi-server cost: each PR gets its own pipeline env, identified by a branch or feature name, while all envs share the same scheduler process.

Implementation at Zalando

Each pipeline environment is a zip (DAG zip packaging) named for the feature branch (feature1.zip). An Airflow DAG id rewriter injects the branch name into every DAG id at init (qu.test_dag → qu.feature1.test_dag), so multiple zips with the same source DAGs can coexist.
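
The id rewriting can be sketched in plain Python. This is a minimal illustration, not Zalando's actual code: the helper names environment_from_zip and rewrite_dag_id are hypothetical, and the rule of inserting the branch name after the first dot-separated segment is an assumption inferred from the single example (qu.test_dag becoming qu.feature1.test_dag).

```python
from pathlib import PurePosixPath


def environment_from_zip(zip_path: str) -> str:
    """Derive the environment name from the zip file name (feature1.zip -> feature1)."""
    return PurePosixPath(zip_path).stem


def rewrite_dag_id(dag_id: str, environment: str) -> str:
    """Insert the environment name after the first dot-separated segment.

    Assumption: the prefix before the first dot (e.g. "qu") stays in place,
    matching the qu.test_dag -> qu.feature1.test_dag example.
    """
    prefix, sep, rest = dag_id.partition(".")
    if not sep:
        # No namespace prefix: fall back to prepending the environment name.
        return f"{environment}.{dag_id}"
    return f"{prefix}.{environment}.{rest}"


env = environment_from_zip("feature1.zip")
print(rewrite_dag_id("qu.test_dag", env))  # qu.feature1.test_dag
```

Because the environment name becomes part of every DAG id, two zips built from the same source DAGs produce disjoint id sets and can be loaded by the same server without colliding.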

Bound to a data environment

A pipeline env must also read from and write to an isolated data layer; otherwise cross-env data conflicts recreate the original sharing problem. Zalando's model is a 1-to-1 binding between a pipeline environment and a data environment: for example, pipeline env feature1 reads/writes db_attribution_feature1.
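
One way to picture the 1-to-1 binding is a naming function that derives the data environment from the pipeline environment. This is a sketch under assumptions: the function name data_environment_for and the base-plus-suffix convention are hypothetical, generalized from the one example (feature1 and db_attribution_feature1).

```python
def data_environment_for(pipeline_env: str, base: str = "db_attribution") -> str:
    """Return the data environment bound to a pipeline environment.

    Assumption: the binding is a naming convention, <base>_<pipeline_env>,
    as suggested by the feature1 -> db_attribution_feature1 example.
    """
    return f"{base}_{pipeline_env}"


# Each pipeline environment resolves to its own data environment,
# so two environments never read or write the same database.
bindings = {env: data_environment_for(env) for env in ["feature1", "feature2"]}
print(bindings["feature1"])  # db_attribution_feature1
```

Deriving the data environment name from the pipeline environment name means no separate mapping has to be maintained; distinct pipeline environments always resolve to distinct data environments.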
