CONCEPT Cited by 1 source
Declarative vs Imperative Stream API¶
The declarative vs imperative stream API tradeoff is the
choice between expressing a streaming computation as a query (SQL
/ relational / planner-optimised) versus as an explicit operator
graph (map / keyBy / process / state handling by hand).
The canonical framing from Zalando's 2026-03 Flink post (sources/2026-03-03-zalando-why-we-ditched-flink-table-api-joins-cutting-state-by-75-with-datastream-unions):
"Flink SQL is perfect for 90% of use cases — it's fast, elegant, and maintainable. But a software engineer's value is in recognizing the remaining 10%: the use cases where the abstraction starts costing too much."
The 90 % case¶
Declarative APIs (Flink Table API / SQL, ksqlDB on Kafka Streams, Spark Structured Streaming SQL) win on:
- Expressive density. Joins, aggregations, windowing in a few lines.
- Planner-authored correctness. Watermarks, late-arrival handling, retractions are handled by the engine.
- Maintainability. Readable by team members who don't know the runtime internals.
- Portability across engines. Standard SQL subsets transfer.
The 10 % case — where the abstraction leaks¶
Three common leak shapes observed in this wiki's corpus:
- State amplification across N-way joins. The Flink 1.x Table-API pattern of independent per-join operators compounds state multiplicatively — see concepts/flink-stateful-join-state-amplification. The planner cannot share state across the chain; the user cannot reach in to fix it.
- Per-key temporal logic the planner can't represent cheaply. When the application knows that an incoming event whose timestamp <= stored.timestamp is a no-op, the imperative code can return before touching state (patterns/event-time-filter-for-state-write-reduction). In SQL the same semantics require aggregations plus ranking functions that the planner evaluates expensively.
- Dominant one-shot workload inside the job. When one operator does 80 % of the work (hot-key handling, tail-latency filtering, specialised dedupe), the planner's even-handed optimisation is a penalty, not a benefit.
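The per-key temporal guard can be sketched in plain Java with no Flink dependency. This is a minimal sketch, assuming illustrative names: `lastTimestampByKey` stands in for what would be `ValueState` inside a `KeyedProcessFunction` in a real job.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the event-time guard: any event whose timestamp is not
// strictly newer than the stored one is a no-op, so the code returns
// before touching state. (Illustrative stand-in for Flink keyed state.)
class EventTimeGuard {
    private final Map<String, Long> lastTimestampByKey = new HashMap<>();

    /** Returns true iff the event was newer and state was written. */
    public boolean accept(String key, long timestamp) {
        Long stored = lastTimestampByKey.get(key);
        if (stored != null && timestamp <= stored) {
            return false; // stale: return before any state write
        }
        lastTimestampByKey.put(key, timestamp);
        return true;
    }
}
```

The point of the sketch is the early return: the "few if guards" fire before any state access, which is exactly the shortcut a planner-generated aggregation-plus-ranking plan cannot take.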
Warning signs that you're in the 10 %¶
- State outgrows the storage allocated per capacity unit (e.g., KPU on AWS Managed Flink).
- Snapshot cost dominates normal workload (concepts/flink-snapshot-savepoint).
- The SQL version has aggregations + ranking functions just to recover the semantics the application naturally encodes.
- Observed cost / throughput is orders-of-magnitude worse than a back-of-envelope estimate.
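The last warning sign can be made concrete with a toy estimate. The model below is a deliberately crude sketch under stated assumptions (1:1 join keys, n rows per input, fixed record widths, each chained join operator storing both of its inputs in full); the numbers are illustrative, not Zalando's actual figures.

```java
// Back-of-envelope state estimate for an N-way streaming join.
// Chained pairwise joins re-store the ever-wider upstream payload at
// each stage; a single keyed store keeps only the latest record per
// input. Simplified model: 1:1 keys, n rows per input, fixed widths.
class StateEstimate {
    static long chainedJoinStateBytes(long n, long[] widths) {
        long total = 0;
        long leftWidth = widths[0];               // accumulated left side
        for (int i = 1; i < widths.length; i++) {
            total += n * (leftWidth + widths[i]); // join i stores both inputs
            leftWidth += widths[i];               // output rows get wider
        }
        return total;
    }

    static long unionedStateBytes(long n, long[] widths) {
        long rowWidth = 0;
        for (long w : widths) rowWidth += w;      // latest record per input
        return n * rowWidth;
    }
}
```

With four 1 M-row inputs at 200 bytes per record, the chained model retains roughly 1.8 GB against 0.8 GB for the single keyed store. Real numbers depend on key skew, TTLs, and backend overheads, but even an estimate this crude is enough to flag the gap the warning sign describes.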
The imperative rewrite payoff¶
When the 10 % signature fits, the imperative version is often less verbose, not more. Zalando's remark is worth quoting:
"The 'more manual' approach turned out to be even less verbose than the SQL version, because our SQL was quite complex, with aggregations for calculating the maximal timestamps between several parts of the join and with ranking functions for making sure the last record from the same part of the join always wins."
Once the temporal logic is "just a few if guards," the verbosity the SQL version spends re-encoding that same logic becomes visible.
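The shape of that rewrite can be sketched without Flink. This is a minimal, illustrative model of union plus keyed last-write-wins state: record and field names are invented, and in a real job this logic would live in a `KeyedProcessFunction` over a unioned `DataStream`.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of union + keyed last-write-wins: all input streams are merged
// into one, and per key we keep only the newest record from each side.
// "side" tags which original stream a record came from.
class EnrichmentState {
    record Event(String key, String side, long timestamp, String payload) {}

    // key -> (side -> newest event seen for that side)
    private final Map<String, Map<String, Event>> state = new HashMap<>();

    /** Process one unioned event; the last record per side always wins. */
    public void process(Event e) {
        Map<String, Event> perSide =
            state.computeIfAbsent(e.key(), k -> new HashMap<>());
        Event stored = perSide.get(e.side());
        if (stored != null && e.timestamp() <= stored.timestamp()) {
            return; // the "few if guards" replacing SQL ranking functions
        }
        perSide.put(e.side(), e);
    }

    public String payload(String key, String side) {
        Event e = state.getOrDefault(key, Map.of()).get(side);
        return e == null ? null : e.payload();
    }
}
```

One stored record per key per side is the whole state: the max-timestamp aggregations and ranking functions of the SQL version collapse into the single timestamp comparison in `process`.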
Seen in¶
- sources/2026-03-03-zalando-why-we-ditched-flink-table-api-joins-cutting-state-by-75-with-datastream-unions — canonical statement of the 90/10 framing; Zalando's 4-way Product Offer Enrichment was the 10 %; rewrite Table API → DataStream API cut state by 76 % and cost by 13 %.
Related¶
- systems/flink-table-api — the declarative side.
- systems/flink-datastream-api — the imperative side.
- concepts/flink-stateful-join-state-amplification — the most common leak shape that sends teams to the imperative API.
- patterns/stream-union-plus-keyed-process-function — the imperative form.