CONCEPT Cited by 1 source
Workflow rollup¶
A workflow rollup is a recursive summary of step statuses across a workflow instance, flattening nested subworkflows and foreach iterations into a count of leaf steps by status. Only leaf steps count: intermediate structural steps (subworkflow containers, foreach containers) point to work rather than perform it, so they are excluded.
The canonical wiki instance is Netflix Maestro's rollup (Source: sources/2024-07-22-netflix-maestro-netflixs-workflow-orchestrator):
"Rollup provides a high-level summary of a workflow instance, detailing the status of each step and the count of steps in each status. It flattens steps across the current instance and any nested non-inline workflows like subworkflows or foreach steps."
Worked example¶
"For instance, if a successful workflow has three steps, one of which is a subworkflow corresponding to a five-step workflow, the rollup will indicate that seven steps succeeded."
Counting: 2 directly-successful steps + 5 leaf steps inside the subworkflow = 7 successful leaf steps. The subworkflow container itself doesn't count — it's a pointer, not work.
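The counting rule can be sketched as a short recursion. This is a minimal illustration rather than Maestro's code: the `Step` class, the `rollup` function, and the step names are hypothetical, and a step is treated as a leaf when it has no `children`.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Step:
    """Hypothetical step model: a leaf has a status, a container has children."""
    name: str
    status: str = "SUCCEEDED"
    children: Optional[List["Step"]] = None  # None marks a leaf step


def rollup(steps: List[Step]) -> Counter:
    """Count leaf steps by status, recursing through container steps."""
    counts: Counter = Counter()
    for step in steps:
        if step.children is None:
            counts[step.status] += 1
        else:
            # The container itself is skipped; only its leaves are counted.
            counts.update(rollup(step.children))
    return counts


# Three top-level steps, one of which is a five-step subworkflow.
subworkflow = Step("sub", children=[Step(f"sub_step_{i}") for i in range(5)])
workflow = [Step("extract"), Step("load"), subworkflow]

result = rollup(workflow)
print(result["SUCCEEDED"])  # 7
```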
Why leaf-only counting is the right rule¶
A workflow orchestrator's real work is executed by leaf steps — the nodes that actually run Docker images, bash scripts, SQL queries, Python functions. Intermediate container steps exist for structuring the workflow graph, not for doing work.
Counting containers in the rollup would:
- Double-count — the subworkflow step would count once, and its internal leaf steps would each count once.
- Obscure failure location — a container's status only mirrors the steps inside it, so a failed container shows that something broke without pinpointing which leaf step.
- Misrepresent work volume — a workflow with 3 containers wrapping 300 leaf iterations would report 3 successful steps instead of 300.
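The arithmetic behind the last bullet, as a small sketch. The data shape is an assumption made for illustration: each container is modelled as a plain list of leaf statuses.

```python
# Hypothetical shape: 3 foreach containers, each wrapping 100 leaf iterations.
containers = [["SUCCEEDED"] * 100 for _ in range(3)]

leaf_only = sum(len(leaves) for leaves in containers)  # the rollup rule
containers_only = len(containers)                       # hides the real volume
containers_and_leaves = containers_only + leaf_only     # counts containers on top of their leaves

print(leaf_only, containers_only, containers_and_leaves)  # 300 3 303
```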
Preserving failure navigation¶
"Rollup also retains references to any non-successful steps, offering a clear overview of step statuses and facilitating easy navigation to problematic steps, even within nested workflows."
The rollup isn't just counts — it keeps handles to failed leaf steps so operators can drill in. For a workflow with thousands of foreach iterations where 17 failed, the rollup gives the 17 handles directly without requiring manual navigation through the aggregation tree.
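One way to picture a rollup that carries both counts and handles is sketched below. The `Rollup` class, the dictionary-based step tree, and the slash-separated path used as a handle are all assumptions for illustration; the source does not describe the reference format. The sketch also assumes every status other than `SUCCEEDED` is worth a reference.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Rollup:
    counts: Counter = field(default_factory=Counter)
    # Handles to leaf steps that did not succeed (hypothetical path format).
    non_successful: List[str] = field(default_factory=list)


def build_rollup(node: Dict, path: str = "") -> Rollup:
    """Recursively count leaves and collect paths to non-successful ones."""
    here = f"{path}/{node['name']}" if path else node["name"]
    result = Rollup()
    children = node.get("children")
    if children is None:  # leaf step
        result.counts[node["status"]] += 1
        if node["status"] != "SUCCEEDED":
            result.non_successful.append(here)
        return result
    for child in children:  # container: merge child rollups, skip itself
        sub = build_rollup(child, here)
        result.counts.update(sub.counts)
        result.non_successful.extend(sub.non_successful)
    return result


# 1000 foreach iterations with one leaf each; every 60th iteration fails.
iterations = [
    {"name": f"iter_{i}", "children": [
        {"name": "transform", "status": "FAILED" if i % 60 == 0 else "SUCCEEDED"}
    ]}
    for i in range(1000)
]
workflow = {"name": "wf", "children": [{"name": "foreach", "children": iterations}]}

summary = build_rollup(workflow)
print(summary.counts["FAILED"])   # 17
print(summary.non_successful[0])  # wf/foreach/iter_0/transform
```

An operator reading `summary.non_successful` gets the 17 failing leaves directly, without walking the 1000 iterations.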
Merge rules (recursive)¶
For subworkflow steps, the rollup is the rollup of the subworkflow instance — direct passthrough.
For foreach steps, the rollup combines:
- Base rollup — the previous run's aggregated rollup, excluding the iterations being restarted. When a user restarts a foreach workflow, only the targeted iterations re-run; the rest retain their prior terminal state.
- Current-state rollup — periodically updated by aggregating the rollups of currently-running iterations until all reach terminal state.
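The foreach merge can be sketched as the sum of the two parts. The function name, the per-iteration `Counter` representation, and the iteration ids are hypothetical; the sketch only mirrors the rule stated above and is not taken from Maestro's implementation.

```python
from collections import Counter
from typing import Dict, Set


def foreach_rollup(previous_run: Dict[int, Counter],
                   restarted: Set[int],
                   current_run: Dict[int, Counter]) -> Counter:
    """Combine the base rollup with the current-state rollup."""
    # Base rollup: prior results, minus iterations that are being re-run.
    base: Counter = Counter()
    for iteration_id, counts in previous_run.items():
        if iteration_id not in restarted:
            base.update(counts)
    # Current-state rollup: whatever the re-running iterations report now.
    current: Counter = Counter()
    for counts in current_run.values():
        current.update(counts)
    return base + current


previous_run = {
    0: Counter(SUCCEEDED=2),
    1: Counter(SUCCEEDED=1, FAILED=1),  # this iteration gets restarted
    2: Counter(SUCCEEDED=2),
}
restarted = {1}
current_run = {1: Counter(SUCCEEDED=1, RUNNING=1)}  # restart still in flight

merged = foreach_rollup(previous_run, restarted, current_run)
print(merged["SUCCEEDED"], merged["RUNNING"], merged["FAILED"])  # 5 1 0
```

The stale `FAILED` count from iteration 1 drops out of the base, and the restart's live state takes its place.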
Eventual consistency¶
"Due to these processes, the rollup model is eventually consistent."
The rollup isn't read directly from primary state; it's derived from a periodic aggregation of running iteration rollups. For deep nesting or wide fan-out, the rollup may momentarily understate or overstate counts during transitions. Operators must accept that the rollup is a close but not instantaneous view.
See concepts/eventual-consistency for the general property.
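A toy model of the lag, assuming a derived snapshot that is only rebuilt when a refresh runs. `ForeachRollupView` and its methods are hypothetical names; a real system would trigger `refresh` on a timer rather than by hand.

```python
from collections import Counter


class ForeachRollupView:
    """Hypothetical read model: readers see a snapshot, not primary state."""

    def __init__(self) -> None:
        self.iteration_status = {}  # primary state: iteration id -> status
        self.snapshot = Counter()   # derived state served to readers

    def set_status(self, iteration_id: int, status: str) -> None:
        # Updates primary state only; the snapshot is left untouched.
        self.iteration_status[iteration_id] = status

    def refresh(self) -> None:
        # Stand-in for the periodic aggregation job.
        self.snapshot = Counter(self.iteration_status.values())


view = ForeachRollupView()
for i in range(3):
    view.set_status(i, "RUNNING")
view.refresh()

view.set_status(0, "SUCCEEDED")  # an iteration finishes between refreshes
stale = Counter(view.snapshot)   # still reports three running iterations
view.refresh()
fresh = Counter(view.snapshot)   # now reflects the finished iteration

print(stale["RUNNING"], fresh["RUNNING"], fresh["SUCCEEDED"])  # 3 2 1
```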
Why this is non-trivial¶
"While the figure below illustrates a straightforward example of rollup, the calculations can become complex and recursive, especially with multiple levels of nested foreaches and subworkflows."
The recursion depth and fan-out of Maestro's composite workflows (a foreach inside a subworkflow inside a foreach) mean a naive implementation would re-traverse the whole tree on every query. The production implementation appears to cache interior rollups and invalidate them lazily as iterations complete, trading freshness for query latency.
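To make the cost difference concrete, here is a sketch of one possible caching scheme. It is an assumption, not the documented Maestro mechanism: each node stores its last computed rollup, a status change clears the cache along the path to the root, and recomputation happens only on the next query. `Node`, `rollup`, `set_status`, and the `visits` counter are hypothetical.

```python
from collections import Counter


class Node:
    def __init__(self, name, status=None, children=()):
        self.name = name
        self.status = status            # set on leaves only
        self.children = list(children)
        self.parent = None
        self.cached = None              # last computed rollup, if still valid
        for child in self.children:
            child.parent = self


visits = 0  # number of nodes actually recomputed


def rollup(node):
    """Return the node's rollup, recomputing only where the cache is empty."""
    global visits
    if node.cached is not None:
        return node.cached
    visits += 1
    if not node.children:
        result = Counter({node.status: 1})
    else:
        result = Counter()
        for child in node.children:
            result.update(rollup(child))
    node.cached = result
    return result


def set_status(leaf, status):
    """Change a leaf's status and clear caches on the path to the root."""
    leaf.status = status
    node = leaf
    while node is not None:
        node.cached = None
        node = node.parent


# 3 iterations with 4 leaf steps each: 16 nodes in total.
containers = [
    Node(f"iter_{i}",
         children=[Node(f"iter_{i}/step_{j}", status="RUNNING") for j in range(4)])
    for i in range(3)
]
root = Node("foreach", children=containers)

rollup(root)
first_pass = visits                      # full traversal
set_status(containers[0].children[0], "SUCCEEDED")
total = rollup(root)
second_pass = visits - first_pass        # only the invalidated path

print(first_pass, second_pass)  # 16 3
```

After the first full traversal, a single leaf change costs three recomputations (leaf, its container, root) instead of sixteen, which is the kind of saving the caching hypothesis implies.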
Seen in¶
- sources/2024-07-22-netflix-maestro-netflixs-workflow-orchestrator — the canonical rollup + recursion framing