CONCEPT Cited by 1 source
Multi-Way Join Operator (Flink)¶
A multi-way join operator in Flink is a single keyed operator that joins N streams at once, holding one shared state per key, rather than the Flink 1.x pattern of chaining N−1 binary join operators that each maintain their own state.
The Flink project formalised this in FLIP-516 (Multi-Way Join Operator, 2025-05-19) and shipped it as experimental in Flink 2.1 — see systems/flink-multijoin-operator.
Why it matters¶
The shape directly addresses Flink stateful join state amplification: instead of cloning the relevant data once per binary join in a chain, the operator holds one record per key for the entire multi-way join. Published upstream benchmarks report 2× to over 100×+ increase in processed records and 3× to over 1000×+ smaller state versus the chained-join baseline.
Relationship to the hand-rolled DataStream form¶
The Zalando team reimplemented the same idea on the
DataStream API — via stream union
+ keyBy(SKU) + a custom KeyedProcessFunction holding a single
ValueState — because AWS Managed Flink did not yet offer Flink
2.x. The shape is structurally the same:
keyed stream union into one
operator, one state copy per key, no intermediate state. The
DataStream form is more verbose and less planner-friendly but
works on any Flink version.
Comparison:
| Aspect | Chained binary joins (Flink 1.x Table API) | Hand-rolled KeyedProcessFunction |
Native MultiJoin (Flink 2.1) |
|---|---|---|---|
| Number of operators | N−1 | 1 | 1 |
| State copies per key | N−1 | 1 | 1 |
| API | Declarative SQL | Imperative DataStream | Declarative SQL, planner-emitted |
| Flink version needed | 1.x | 1.x | 2.1+ |
| Join types supported | All | Whatever you code | INNER / LEFT (experimental; RIGHT coming) |
Caveats¶
Flink docs explicitly disclaim: "This is currently in an experimental state — there are open optimizations and breaking changes might be implemented in this version. We currently support only streaming INNER/LEFT joins. Support for RIGHT joins will be added soon."
Seen in¶
- sources/2026-03-03-zalando-why-we-ditched-flink-table-api-joins-cutting-state-by-75-with-datastream-unions — cited as the native Flink 2.1 solution to the problem Zalando solved by hand in DataStream API on Flink 1.20 / AWS Managed Flink; "this feature alone will be worth the wait when we get there, but until then, we're covered by our home-baked solution."
Related¶
- systems/apache-flink — parent engine.
- systems/flink-table-api — API surface where
MultiJoinappears (planner emits it). - systems/flink-multijoin-operator — the system page with FLIP reference and benchmarks.
- concepts/flink-stateful-join-state-amplification — the pathology it solves.
- concepts/flink-keyed-stream-union — the DataStream-altitude equivalent shape.
- patterns/single-valuestate-over-chained-joins — the hand-rolled pattern.