PATTERN Cited by 2 sources
Signal-based publish/subscribe step triggering¶
Summary¶
In a workflow orchestrator, unify inter-workflow triggering and intra-workflow step dependency behind a single signal primitive. Signals are messages carrying parameter values, produced either by step outputs or by external systems. When a subscriber's match criteria are met, the signal unblocks the subscribing step or starts the subscribing workflow.
Canonical wiki instance: Netflix Maestro's signal mechanism (Source: sources/2024-07-22-netflix-maestro-netflixs-workflow-orchestrator).
Problem¶
Workflow orchestrators have historically used separate mechanisms for each kind of triggering:
- Intra-workflow dependency: DAG edges, conditional-branch predicates.
- Inter-workflow triggering from internal producers: Airflow ExternalTaskSensor, Step Functions invoking other state machines.
- Inter-workflow triggering from external producers: webhook handlers, SNS subscribers, Kafka consumers written as custom glue.
Each mechanism has its own semantics (at-most-once vs at-least-once vs exactly-once), its own failure modes, its own audit trail, and its own configuration surface. Cross-team integration typically requires wiring two or more of these together.
Solution¶
One signal primitive, with three properties:
- Dual semantics — the same signal serves both pub-sub (one producer unblocks many consumers) and trigger (external event starts a workflow).
- Parameter-carrying matching — signals carry parameter values; subscribers specify mapped-field subsets + operators (<, >, =, etc.) for matching.
- Exactly-once per match — subscribers run exactly once per matching signal or joined-signal set, regardless of substrate delivery guarantees.
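Parameter-carrying matching can be illustrated with a minimal sketch. This is not Maestro's API: the `Condition` and `Subscription` classes, the `table_updated` signal name, and the operator table are all hypothetical, and the source only names the operators "<, >, =, etc." A subscriber lists conditions over a subset of the signal's fields, and a signal matches when every condition holds.

```python
import operator
from dataclasses import dataclass
from typing import Any, Callable, Mapping

# Hypothetical operator table; the source only names "<, >, =, etc."
OPERATORS: dict[str, Callable[[Any, Any], bool]] = {
    "<": operator.lt,
    ">": operator.gt,
    "=": operator.eq,
    "<=": operator.le,
    ">=": operator.ge,
}


@dataclass(frozen=True)
class Condition:
    field: str   # name of the signal parameter to inspect
    op: str      # key into OPERATORS
    value: Any   # value the parameter is compared against


@dataclass(frozen=True)
class Subscription:
    signal_name: str
    conditions: tuple[Condition, ...]

    def matches(self, name: str, params: Mapping[str, Any]) -> bool:
        """True when the signal name agrees and every condition holds.

        Only the fields named in the conditions are inspected, so a signal
        may carry extra parameters the subscriber does not care about.
        """
        if name != self.signal_name:
            return False
        return all(
            c.field in params and OPERATORS[c.op](params[c.field], c.value)
            for c in self.conditions
        )


sub = Subscription(
    "table_updated",
    (Condition("table", "=", "X"), Condition("partition", ">=", 20240701)),
)
hit = sub.matches("table_updated", {"table": "X", "partition": 20240722, "rows": 10})
miss = sub.matches("table_updated", {"table": "X", "partition": 20240601})
```

Here `hit` is true because both conditions hold and the extra `rows` parameter is ignored, while `miss` is false because the partition falls below the subscriber's bound.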
Source coverage¶
Signals can be produced by:
- Step outputs inside the same orchestrator — "One step can publish an output signal … that can unblock the execution of multiple other steps that depend on it." (Source: sources/2024-07-22-netflix-maestro-netflixs-workflow-orchestrator)
- External systems like Kafka / SNS / webhook endpoints.
This closes the producer-origin gap — the same primitive handles within-workflow, cross-workflow, and cross-system triggering uniformly.
Signal lineage¶
"Maestro supports 'signal lineage,' which allows users to navigate all historical instances of signals and the workflow steps that match (i.e. publishing or consuming) those signals."
The orchestrator persists a queryable history of producer, consumer, and match events, which is essential for debugging ("why didn't my downstream trigger?") and for audit ("what flows consumed this table update?").
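As a rough illustration of what a queryable lineage history makes possible, the sketch below keeps an in-memory log of publish and consume events per signal. The `LineageEvent` and `LineageLog` names, the event fields, and the `explain` helper are assumptions made for this example; the source does not describe Maestro's lineage schema or query interface.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LineageEvent:
    signal_id: str
    role: str      # "published" or "consumed" (hypothetical vocabulary)
    workflow: str
    step: str


class LineageLog:
    """In-memory stand-in for a persisted lineage store."""

    def __init__(self) -> None:
        self._by_signal = defaultdict(list)

    def record(self, event: LineageEvent) -> None:
        self._by_signal[event.signal_id].append(event)

    def consumers_of(self, signal_id: str) -> list:
        # Audit question: which workflows consumed this signal?
        events = self._by_signal.get(signal_id, [])
        return [e.workflow for e in events if e.role == "consumed"]

    def explain(self, signal_id: str, workflow: str) -> str:
        # Debugging question: why did (or didn't) this workflow trigger?
        events = self._by_signal.get(signal_id, [])
        if not any(e.role == "published" for e in events):
            return "signal was never published"
        if workflow in self.consumers_of(signal_id):
            return "workflow consumed the signal"
        return "published, but this workflow never matched it"


log = LineageLog()
log.record(LineageEvent("sig-1", "published", "etl_daily", "write_table"))
log.record(LineageEvent("sig-1", "consumed", "team_b_ml_training", "start"))
```

With this log, `log.explain("sig-1", "team_c_reporting")` distinguishes "the signal never arrived" from "the signal arrived but did not match", which is the distinction the debugging question turns on.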
Exactly-once guarantee¶
"Signal triggering guarantees exactly-once execution for the workflow subscribing a signal or a set of joined signals."
See concepts/exactly-once-signal-trigger for the architectural shape — orchestrator-level dedup state over an at-least-once substrate.
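The general shape of orchestrator-level dedup over an at-least-once substrate can be sketched as follows. This is an assumption-laden illustration rather than Maestro's implementation: the `ExactlyOnceTrigger` class and its `(consumer, mapped_key)` dedup key are hypothetical, and the state lives in memory, whereas a real orchestrator would need to record the key and start the run durably and atomically.

```python
from typing import Callable, Hashable


class ExactlyOnceTrigger:
    """Drops redelivered signals by remembering (consumer, mapped_key) pairs.

    In-memory sketch only: durability and atomicity are out of scope here.
    """

    def __init__(self, start: Callable[[str, Hashable], None]) -> None:
        self._fired = set()
        self._start = start

    def deliver(self, consumer: str, mapped_key: Hashable) -> bool:
        key = (consumer, mapped_key)
        if key in self._fired:
            return False  # redelivery from the substrate: do not run again
        self._fired.add(key)
        self._start(consumer, mapped_key)
        return True


runs = []
trigger = ExactlyOnceTrigger(lambda consumer, key: runs.append((consumer, key)))

# The substrate delivers the same signal three times to one consumer.
for _ in range(3):
    trigger.deliver("team_b_ml_training", ("X", 20240722))

# A different consumer of the same signal still gets its own single run.
trigger.deliver("team_c_reporting", ("X", 20240722))
```

Keying on the consumer as well as the mapped values means each subscriber runs once per match, while duplicates of the same delivery are absorbed.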
When to use¶
- Multi-team workflow integration — signals become the organisational protocol without N² wiring.
- Data-pipeline trigger — ETL workflow produces a signal with the partition key; downstream ML/reporting workflows subscribe to specific partitions.
- Event-driven ML retraining — new data signals trigger retraining workflows conditional on sufficient accumulated data.
- External-event-driven workflows — webhook / Kafka event → signal → workflow trigger with exactly-once.
When not to use¶
- Tight intra-team dependencies — for a small set of workflows owned by one team, DAG edges may be simpler.
- Real-time triggering — signals add latency; sub-second triggering is better served by native Kafka consumers.
- Orchestrators that don't support exactly-once — the pattern becomes dangerous if the substrate is at-least-once and the orchestrator doesn't add dedup.
Structure¶
   External events     Internal step outputs
 (SNS / Kafka / HTTP)  (workflow A, step 3)
          │                      │
          ▼                      ▼
┌─────────────────────────────────────────┐
│         Maestro signal service          │
│  · persist signal (lineage)             │
│  · match against subscribers            │
│  · dedup per (consumer, mapped-key)     │
└─────────────────────────────────────────┘
                     │
          ┌──────────┼──────────┐
          ▼          ▼          ▼
   ┌──────────┐ ┌─────────┐ ┌──────────┐
   │ workflow │ │ step X  │ │ workflow │
   │ B start  │ │ unblock │ │ C start  │
   └──────────┘ └─────────┘ └──────────┘
Example (from the post)¶
An ETL workflow updates a table with data and publishes a signal
containing the partition key. Three downstream consumers subscribe
to signals with table=X AND partition>=Y:
- Team B's ML training workflow — triggers exactly once per partition match.
- Team C's reporting workflow — triggers exactly once per partition match.
- Team D's audit-sampling step — unblocks once the threshold number of matching partitions is available.
None of these teams know about the others; the signal protocol is the integration contract.
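The fan-out in this example can be sketched with a toy in-process bus. Everything named here is hypothetical: `SignalBus`, the consumer labels, the concrete bound `partition >= 3` standing in for `partition>=Y`, and the threshold of 2 partitions for Team D's step. Dedup and persistence are deliberately left out to keep the focus on one producer reaching several independent subscribers.

```python
from typing import Any, Callable

Signal = dict[str, Any]


class SignalBus:
    """Toy bus: every subscriber whose predicate matches gets the signal."""

    def __init__(self) -> None:
        self._subs = []

    def subscribe(
        self,
        predicate: Callable[[Signal], bool],
        action: Callable[[Signal], None],
    ) -> None:
        self._subs.append((predicate, action))

    def publish(self, signal: Signal) -> None:
        for predicate, action in self._subs:
            if predicate(signal):
                action(signal)


def wants(signal: Signal) -> bool:
    # Stand-in for "table=X AND partition>=Y", with Y assumed to be 3.
    return signal["table"] == "X" and signal["partition"] >= 3


started = []
seen_partitions = set()
THRESHOLD = 2  # hypothetical number of partitions Team D waits for


def audit_step(signal: Signal) -> None:
    # Team D's step accumulates partitions and unblocks at the threshold.
    seen_partitions.add(signal["partition"])
    if len(seen_partitions) == THRESHOLD:
        started.append(("team_d_audit_step", "unblocked"))


bus = SignalBus()
bus.subscribe(wants, lambda s: started.append(("team_b_ml_training", s["partition"])))
bus.subscribe(wants, lambda s: started.append(("team_c_reporting", s["partition"])))
bus.subscribe(wants, audit_step)

# The ETL workflow publishes one signal per partition it writes.
for partition in (2, 3, 4):
    bus.publish({"table": "X", "partition": partition})
```

The ETL producer only calls `publish`; it holds no reference to any of the three consumers, which mirrors the point that the signal itself is the integration contract.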
Seen in¶
- sources/2024-07-22-netflix-maestro-netflixs-workflow-orchestrator — the primitive + exactly-once guarantee
- sources/2024-07-22-netflix-supporting-diverse-ml-systems-at-netflix — the organisational-protocol framing