PATTERN Cited by 1 source
Reactive-plus-proactive stream processing¶
Definition¶
A stateful stream processing pattern where a single processor class implements two distinct execution paths:
- Reactive path — responds to incoming events (data-driven): e.g., processing session starts and ends as they arrive from Kafka.
- Proactive path — fires on a timer schedule independent of input data: e.g., emitting heartbeats, checking timeouts, producing derived signals.
This separates input-triggered logic from time-triggered logic within one stateful operator, avoiding the need for external cron schedulers or secondary systems.
Shape (Spark transformWithState)¶
class SessionizationProcessor extends StatefulProcessor {
// Reactive: fires on each input record
handleInputRows(rows) { /* session start/end logic */ }
// Proactive: fires on timer expiry (every 30s)
handleExpiredTimer(timerKey) { /* heartbeat / timeout logic */ }
}
When to use¶
- Workloads requiring both immediate response to events AND scheduled output (heartbeats, SLA checks, timeouts).
- Session-lifecycle tracking where downstream needs periodic updates even when nothing is happening upstream.
- Any scenario with high output amplification — proactive path dominates output volume.
Trade-offs¶
- Pro: Single processor manages entire entity lifecycle — simpler than coordinating an event-driven service + a separate timer/cron service.
- Pro: State is shared between both paths — the timer can access the same session state that input processing writes.
- Con: Timer granularity depends on the engine's execution model. Micro-batch mode aligns timers to batch boundaries; Real-Time Mode gives sub-second precision.
- Con: Debugging is harder — output may come from either path, requiring correlation of timer-driven vs input-driven events.
Seen in¶
- sources/2026-06-03-databricks-apache-spark-real-time-mode-for-gaming —
handleInputRows()processes GameStart/GameEnd events;handleExpiredTimer()emits heartbeats every 30s and detects session timeouts. The proactive path generates 16× more output than the reactive path.
Related¶
- concepts/stateful-stream-processing — the underlying execution model
- patterns/timer-driven-heartbeat-emission — specific instance of the proactive path
- systems/spark-streaming — implements this via
transformWithStatein Real-Time Mode