PATTERN Cited by 1 source
Composite workflow pattern¶
Summary¶
Ship three engine-native workflow primitives — foreach,
subworkflow, conditional branch — that compose into higher-
order workflows. Each primitive has a deterministic semantics
implemented inside the orchestrator engine (not in user code), and
their composition covers common production shapes like auto-
recovery, backfills, and ML model tuning.
Canonical wiki instance: Netflix Maestro's three composite primitives (Source: sources/2024-07-22-netflix-maestro-netflixs-workflow-orchestrator).
Problem¶
Workflow orchestrators handle the "N-step linear pipeline" case well. Production patterns that need more than that include:
- Data backfills — run the same pipeline across 1000+ historical date partitions.
- ML hyperparameter sweeps — run the same training workflow across a grid of hyperparameter combinations.
- Audit-remediate-retry — check output; if invalid, run a remediation flow; retry main workflow.
- Hierarchical reuse — a complex workflow reused as a component inside other workflows.
Without engine-native primitives, users fall back to either:
- External drivers — a user script loops + re-triggers the workflow. Loses orchestrator visibility; no unified lineage.
- Hand-rolled DAG expansion — generate a DAG with 1000+ nodes at definition time. Definitions bloat and the approach doesn't scale.
- One-off scripts — every team reinvents foreach, retry loops, conditional subworkflow calls with subtle bugs.
Solution¶
Three primitives, implemented in the engine — so they're optimised uniformly, have consistent semantics across the organisation, and compose cleanly.
1. Foreach¶
- Models iteration over a parameter collection.
- "Each iteration of the foreach loop is internally treated as a separate workflow instance, which scales similarly as any other Maestro workflow." (Source: sources/2024-07-22-netflix-maestro-netflixs-workflow-orchestrator)
- The foreach step monitors + collects per-iteration statuses.
- Parameters `loop_params` + `loop_index` are engine-injected for each iteration.
Canonical use: data backfill, ML hyperparameter tuning, partition processing.
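A minimal sketch of the expansion idea, assuming each iteration becomes its own workflow instance with `loop_params` values and a `loop_index` injected by the engine (the `foreach_expand` function and its signature are illustrative, not Maestro's actual API):

```python
# Hypothetical sketch: an engine-native foreach expands a parameter collection
# into per-iteration workflow instances and collects their statuses.
def foreach_expand(loop_params: dict, run_iteration) -> list:
    """Treat each iteration as a separate workflow instance; return statuses."""
    keys = list(loop_params)
    statuses = []
    # Zip the parameter lists so each index yields one iteration's params.
    for loop_index, values in enumerate(zip(*loop_params.values())):
        params = dict(zip(keys, values))
        params["loop_index"] = loop_index       # engine-injected per iteration
        statuses.append(run_iteration(params))  # e.g. "SUCCEEDED" / "FAILED"
    return statuses

# Backfill-style usage: one iteration per historical date partition.
dates = {"date": ["2024-01-01", "2024-01-02", "2024-01-03"]}
result = foreach_expand(dates, lambda p: "SUCCEEDED")
```

Because each iteration is a full workflow instance rather than a pre-expanded DAG node, a 1000-partition backfill stays a single small definition.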
2. Subworkflow¶
- A step runs another workflow — "workflow as a function."
- Enables a graph of workflows with shared components.
- Output parameters flow back to the caller.
Canonical use: shared common functions across teams — "complex workflows consisting of hundreds of subworkflows to process data across hundreds tables, where subworkflows are provided by multiple teams."
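The "workflow as a function" idea can be sketched as a registry of named workflows whose output parameters flow back to the caller (the registry, decorator, and `run_subworkflow` helper are hypothetical names for illustration):

```python
# Illustrative sketch: a subworkflow step invokes another registered workflow,
# and its output parameters are returned to the calling workflow.
WORKFLOWS = {}

def workflow(name):
    """Register a workflow definition under a shared name."""
    def register(fn):
        WORKFLOWS[name] = fn
        return fn
    return register

def run_subworkflow(name: str, params: dict) -> dict:
    """Run a registered workflow; outputs flow back to the caller."""
    return WORKFLOWS[name](params)

@workflow("etl_and_audit")
def etl_and_audit(params):
    # A shared component another team might own.
    rows = len(params["records"])
    return {"rows_loaded": rows, "audit_passed": rows > 0}

outputs = run_subworkflow("etl_and_audit", {"records": [1, 2, 3]})
```

The registry is what enables a graph of workflows: any team's workflow can call `etl_and_audit` by name without copying its definition.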
3. Conditional branch¶
- Subsequent steps run only if an upstream condition is met.
- Conditions are SEL expressions evaluated at runtime against step parameters + outputs.
- Combined with other primitives for audit-remediate-retry + error-handling flows.
Canonical use: skip expensive steps when inputs don't justify them; take remediation branches on audit failures.
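A stand-in for runtime condition evaluation: Maestro evaluates SEL expressions, but a restricted Python `eval` over the upstream context approximates the gating behaviour (the `run_branch` helper and context names are assumptions for illustration):

```python
# Minimal sketch: a condition string is evaluated at runtime against upstream
# step params/outputs, and the downstream step runs only if it holds.
def evaluate_condition(expr: str, context: dict) -> bool:
    # No builtins exposed: only names from the upstream context are visible.
    return bool(eval(expr, {"__builtins__": {}}, context))

def run_branch(condition: str, context: dict, then_step, else_step=None):
    if evaluate_condition(condition, context):
        return then_step()
    return else_step() if else_step else "SKIPPED"

# Remediation branch taken only when the audit failed.
ctx = {"audit_failed": True, "row_count": 0}
outcome = run_branch("audit_failed or row_count == 0", ctx,
                     then_step=lambda: "RAN_REMEDIATION")
```

The key property is that the condition is data, evaluated by the engine at runtime, so the skipped branch is visible in lineage rather than hidden in tenant code.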
Composition — the auto-recovery example¶
The post's worked example (rendered as text):
┌────────────────────────┐
│ subworkflow: job1 │ (ETL + audit subworkflow)
└───────┬────────────────┘
│
▼
┌────────────────────────┐
│ status_check_step │ (read job1's audit status via SEL)
└───────┬────────────────┘
│
▼
┌────────────────────────┐
│ conditional branch │
│ if audit_failed: │
└───┬────────────────────┘
│ │
▼ ▼
recovery (complete
subworkflow workflow)
│
▼
┌────────────────────────┐
│ subworkflow: job2 │ (retry ETL + audit with same shape as job1)
└────────────────────────┘
The whole flow is composable — subworkflow + conditional branch +
subworkflow — and the components are engine primitives, not
custom per-team glue.
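The diagram above can be sketched as straight-line composition of the three primitives (function names and the audit-status shape are illustrative, not the post's actual definitions):

```python
# Sketch of the auto-recovery flow: subworkflow -> status check ->
# conditional branch -> recovery subworkflow -> retry subworkflow.
def auto_recovery(run_job, run_recovery):
    outputs = run_job()            # subworkflow: job1 (ETL + audit)
    if outputs["audit_failed"]:    # status_check_step + conditional branch
        run_recovery()             # recovery subworkflow
        outputs = run_job()        # subworkflow: job2 (same shape as job1)
    return outputs

attempts = []
def job():
    attempts.append("job")
    # Fail the audit on the first run; pass once recovery has run.
    return {"audit_failed": "recovery" not in attempts}

result = auto_recovery(job, lambda: attempts.append("recovery"))
```

In the engine-native version, each call above is a primitive step the orchestrator owns, so retries, rollup, and lineage apply uniformly instead of living in per-team glue code.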
Other composition shapes¶
- Nested foreach inside foreach — 2D parameter sweeps (date × dataset).
- Foreach over subworkflows — same pipeline invoked with heterogeneous inputs.
- Conditional subworkflow selection — pick one of N subworkflows based on input characteristics.
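The nested-foreach shape, for example, is just one foreach as the body of another, giving a 2D sweep where every (date, dataset) pair becomes its own iteration (the `foreach` helper here is a toy stand-in for the engine primitive):

```python
# Illustrative 2D sweep: outer foreach over dates, inner foreach over datasets.
def foreach(values, body):
    """Run body once per value, as the engine would per iteration."""
    return [body(i, v) for i, v in enumerate(values)]

dates = ["2024-01-01", "2024-01-02"]
datasets = ["clicks", "views"]

runs = foreach(dates, lambda i, date:
               foreach(datasets, lambda j, ds: (date, ds)))
# runs is a 2x2 grid: one (date, dataset) pair per iteration.
```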
Why engine-native implementation matters¶
"Direct engine support not only enables us to optimize these patterns but also ensures a consistent approach to implementing them."
Three concrete wins:
- Uniform optimisation — the engine can parallelise foreach iterations, batch subworkflow creation, short-circuit conditional evaluation without tenant-code opt-in.
- Consistent failure modes — retry, restart, breakpoints, rollup all interact correctly with the primitives.
- Lineage + observability — all composite structure is visible to the orchestrator for audit + debugging.
When not to use¶
- Simple linear pipelines — don't reach for foreach / conditional when a straight DAG suffices.
- Cross-process orchestration — if the composition needs to span orchestrator boundaries, use signals instead.
- Real-time control loops — orchestrator composition is the wrong altitude for sub-second control; use Temporal-style workflow-as-code or a dedicated control plane.
Seen in¶
- sources/2024-07-22-netflix-maestro-netflixs-workflow-orchestrator — the three primitives + auto-recovery example