PATTERN

Composite workflow pattern

Summary

Ship three engine-native workflow primitives (foreach, subworkflow, conditional branch) that compose into higher-order workflows. Each primitive has deterministic semantics implemented inside the orchestrator engine (not in user code), and their composition covers common production shapes like auto-recovery, backfills, and ML model tuning.

Canonical wiki instance: Netflix Maestro's three composite primitives (Source: sources/2024-07-22-netflix-maestro-netflixs-workflow-orchestrator).

Problem

Workflow orchestrators handle the "N-step linear pipeline" case well. Production patterns that need more than that include:

  • Data backfills — run the same pipeline across 1000+ historical date partitions.
  • ML hyperparameter sweeps — run the same training workflow across a grid of hyperparameter combinations.
  • Audit-remediate-retry — check the output; if invalid, run a remediation flow, then retry the main workflow.
  • Hierarchical reuse — a complex workflow reused as a component inside other workflows.

Without engine-native primitives, users fall back to either:

  • External drivers — a user script loops + re-triggers the workflow. Loses orchestrator visibility; no unified lineage.
  • Hand-rolled DAG expansion — materialise the full DAG (1,000+ nodes) at definition time. Doesn't scale.
  • One-off scripts — every team reinvents foreach, retry loops, conditional subworkflow calls with subtle bugs.

Solution

Three primitives, implemented in the engine — so they're optimised uniformly, have consistent semantics across the organisation, and compose cleanly.

1. Foreach

  • Models iteration over a parameter collection.
  • "Each iteration of the foreach loop is internally treated as a separate workflow instance, which scales similarly as any other Maestro workflow." (Source: sources/2024-07-22-netflix-maestro-netflixs-workflow-orchestrator)
  • The foreach step monitors + collects per-iteration statuses.
  • Parameters loop_params + loop_index are engine-injected for each iteration.

Canonical use: data backfill, ML hyperparameter tuning, partition processing.
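The foreach semantics above can be sketched in Python. This is an illustrative model of the behaviour the source describes (one workflow instance per iteration, engine-injected `loop_params` and `loop_index`, per-iteration status collection), not Maestro's actual implementation or API:

```python
# Hypothetical sketch of foreach expansion: each element of the parameter
# collection is treated as a separate workflow instance, and the foreach
# step monitors and collects per-iteration statuses.

def expand_foreach(body, loop_params):
    """Run `body` once per element; return per-iteration statuses."""
    statuses = []
    for loop_index, params in enumerate(loop_params):
        # Engine-injected parameters for this iteration.
        context = {"loop_index": loop_index, "loop_params": params}
        try:
            body(context)
            statuses.append("SUCCEEDED")
        except Exception:
            statuses.append("FAILED")
    return statuses

# Example: a backfill over three date partitions.
dates = [{"date": d} for d in ("2024-01-01", "2024-01-02", "2024-01-03")]
results = expand_foreach(lambda ctx: None, dates)
```

Because each iteration is its own instance, the engine is free to run them in parallel and report each one's status independently; the sequential loop here is just the simplest way to show the semantics.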

2. Subworkflow

  • A step runs another workflow — "workflow as a function."
  • Enables a graph of workflows with shared components.
  • Output parameters flow back to the caller.

Canonical use: shared common functions across teams — "complex workflows consisting of hundreds of subworkflows to process data across hundreds of tables, where subworkflows are provided by multiple teams."
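"Workflow as a function" can be modelled as an ordinary call whose return value is the child's output parameters. The names below (`run_subworkflow`, `etl_workflow`) are illustrative, not Maestro's API:

```python
# Hypothetical sketch of the subworkflow primitive: a step runs another
# workflow, and the child's output parameters flow back to the caller.

def etl_workflow(params):
    # A reusable child workflow: returns output parameters to its caller.
    rows = params["rows"]
    return {"row_count": len(rows),
            "audit_passed": all(r is not None for r in rows)}

def run_subworkflow(workflow, params):
    # The engine starts the child workflow and surfaces its outputs
    # as step outputs in the parent.
    return workflow(params)

outputs = run_subworkflow(etl_workflow, {"rows": [1, 2, 3]})
```

The same child can then appear in many parents, which is what enables a graph of workflows with shared components rather than copy-pasted steps.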

3. Conditional branch

  • Subsequent steps run only if an upstream condition is met.
  • Conditions are SEL expressions evaluated at runtime against step parameters + outputs.
  • Combined with other primitives for audit-remediate-retry + error-handling flows.

Canonical use: skip expensive steps when inputs don't justify them; take remediation branches on audit failures.
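A conditional branch evaluates a runtime expression against upstream step outputs and gates downstream steps on the result. SEL is an expression language; in this sketch a plain Python predicate stands in for a SEL expression, and all names are illustrative:

```python
# Hypothetical sketch of a conditional branch: downstream steps run only
# when the condition holds against upstream step outputs.

def conditional_branch(step_outputs, condition, then_steps):
    """Run then_steps only if condition(step_outputs) is true."""
    if condition(step_outputs):
        return [step() for step in then_steps]
    return []  # branch skipped; downstream steps never run

upstream = {"audit_failed": True}
ran = conditional_branch(upstream,
                         lambda o: o["audit_failed"],
                         [lambda: "remediation"])
```

Because the engine evaluates the condition itself, it can skip the whole branch without ever scheduling the gated steps — the short-circuiting mentioned below under "Why engine-native implementation matters."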

Composition — the auto-recovery example

The post's worked example (rendered as text):

┌────────────────────────┐
│ subworkflow: job1      │    (ETL + audit subworkflow)
└───────────┬────────────┘
            ▼
┌────────────────────────┐
│ status_check_step      │    (read job1's audit status via SEL)
└───────────┬────────────┘
            ▼
┌────────────────────────┐
│ conditional branch     │
│   if audit_failed:     │
└───┬────────────────┬───┘
    │ (failed)       │ (passed)
    ▼                ▼
┌────────────────┐  (complete
│ recovery       │   workflow)
│ subworkflow    │
└───────┬────────┘
        ▼
┌────────────────────────┐
│ subworkflow: job2      │    (retry ETL + audit with same shape as job1)
└────────────────────────┘

The whole flow is composable — subworkflow + conditional branch + subworkflow — and the components are engine primitives, not custom per-team glue.
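The composition above can be condensed into a few lines. This is a minimal sketch of the auto-recovery shape (subworkflow, status check, conditional branch, recovery, retry) under illustrative names, not Maestro's API:

```python
# Hypothetical sketch of the auto-recovery composition:
# subworkflow -> status check -> conditional branch -> recovery + retry.

def run_auto_recovery(etl, recover):
    outputs = etl()                             # subworkflow: job1 (ETL + audit)
    audit_failed = not outputs["audit_passed"]  # status_check_step
    if audit_failed:                            # conditional branch
        recover()                               # recovery subworkflow
        outputs = etl()                         # subworkflow: job2 (retry, same shape)
    return outputs

# A flaky ETL whose audit fails until recovery has run.
state = {"fixed": False}

def etl():
    return {"audit_passed": state["fixed"]}

def recover():
    state["fixed"] = True

final = run_auto_recovery(etl, recover)
```

Note that each building block in the sketch corresponds to one engine primitive; the composition itself carries no per-team glue logic.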

Other composition shapes

  • Nested foreach inside foreach — 2D parameter sweeps (date × dataset).
  • Foreach over subworkflows — same pipeline invoked with heterogeneous inputs.
  • Conditional subworkflow selection — pick one of N subworkflows based on input characteristics.
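The first shape — nested foreach for a 2D sweep — reduces to a cross product of the two parameter collections, one instance per (date, dataset) pair. A minimal sketch under illustrative names:

```python
# Hypothetical sketch of a nested foreach (2D parameter sweep): the outer
# foreach iterates dates, the inner one datasets, yielding one workflow
# instance per (date, dataset) pair.

def sweep(dates, datasets):
    instances = []
    for di, date in enumerate(dates):            # outer foreach
        for si, dataset in enumerate(datasets):  # inner foreach
            instances.append({"loop_index": (di, si),
                              "date": date,
                              "dataset": dataset})
    return instances

grid = sweep(["2024-01-01", "2024-01-02"], ["clicks", "impressions"])
```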

Why engine-native implementation matters

"Direct engine support not only enables us to optimize these patterns but also ensures a consistent approach to implementing them."

Three concrete wins:

  1. Uniform optimisation — the engine can parallelise foreach iterations, batch subworkflow creation, short-circuit conditional evaluation without tenant-code opt-in.
  2. Consistent failure modes — retry, restart, breakpoints, rollup all interact correctly with the primitives.
  3. Lineage + observability — all composite structure is visible to the orchestrator for audit + debugging.

When not to use

  • Simple linear pipelines — don't reach for foreach / conditional when a straight DAG suffices.
  • Cross-process orchestration — if the composition needs to span orchestrator boundaries, use signals instead.
  • Real-time control loops — orchestrator composition is the wrong altitude for sub-second control; use Temporal-style workflow-as-code or a dedicated control plane.
