Skip to content

SYSTEM Cited by 1 source

Flink MultiJoin Operator (FLIP-516)

MultiJoin Operator is an experimental Flink 2.1 feature (introduced via FLIP-516, 2025-05-19 ) that plans an N-way streaming join into a single keyed operator holding one shared state, rather than the Flink 1.x model of chained independent join operators each keeping their own RocksDB copy of both inputs.

Motivation

The problem it solves is exactly Flink stateful join state amplification: in Flink 1.20's Table API, each JOIN in a chain materialises its own full state for late/updated-arrival correctness, so four joins carry four copies of the keyspace's relevant fields. MultiJoin unifies them under a single operator, keyed on the common join key, and stores one record per key.

"This is currently in an experimental state — there are open optimizations and breaking changes might be implemented in this version. We currently support only streaming INNER/LEFT joins. Support for RIGHT joins will be added soon."

Published benchmark

Flink docs report MultiJoin vs chained joins:

  • 2× to over 100×+ increase in processed records.
  • 3× to over 1000×+ smaller state.

The ratio range reflects how state amplification scales with the number of joined streams and per-stream cardinality.

Seen in

Last updated · 507 distilled / 1,218 read