
PATTERN

Kafka broadcast for shared state

When every node in a service cluster must converge on a shared, eventually-consistent view of state — and consensus or bespoke gossip would be overkill — publish each node's local updates to a Kafka topic that all cluster nodes consume. Every node thereby receives every other node's updates.

Shape

Node A     Node B      Node C      Node D
  |          |           |           |
  v          v           v           v
  +-------- Kafka broadcast topic (all nodes subscribe) --------+
  ^          ^           ^           ^
  |          |           |           |
Node A     Node B      Node C      Node D
  (each node consumes from the topic, including its own writes)

Each node:

  1. Receives some slice of work from upstream load balancing.
  2. Derives shared state from its slice of the work.
  3. Publishes deltas (or complete state fragments) to the broadcast topic.
  4. Consumes the topic and merges everyone's updates — including its own — into its local in-memory view.
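The loop above can be sketched in a few lines. This is an illustrative model only: a plain Python list stands in for the Kafka broadcast topic, and all class and method names here are hypothetical, not taken from the post.

```python
import json

topic = []  # stand-in for the Kafka broadcast topic: every node appends and reads


class Node:
    def __init__(self, name):
        self.name = name
        self.view = {}    # local in-memory view of the shared state
        self.offset = 0   # consumer position in the broadcast topic

    def publish_delta(self, key, value):
        # Steps 2-3: publish a self-contained delta derived from this node's work.
        topic.append(json.dumps({"node": self.name, "key": key, "value": value}))

    def poll_and_merge(self):
        # Step 4: consume the topic and merge everyone's updates, own included.
        for record in topic[self.offset:]:
            delta = json.loads(record)
            self.view[delta["key"]] = delta["value"]
        self.offset = len(topic)


a, b = Node("A"), Node("B")
a.publish_delta("k1", 1)   # node A derives and publishes its fragment
b.publish_delta("k2", 2)   # node B does the same
a.poll_and_merge()
b.poll_and_merge()
assert a.view == b.view == {"k1": 1, "k2": 2}  # both nodes converge
```

Because each node also consumes its own writes, no special-casing is needed to distinguish local from remote updates: everything flows through the same merge path.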

When this is the right choice

  • Convergence, not consensus. You want eventual consistency (merge the union of statements), not a totally-ordered log.
  • Low update rate. Broadcast cost scales with O(updates × nodes).
  • Existing Kafka. Reuse the org's Kafka rather than operating a separate gossip mesh / Raft cluster.
  • State is idempotently mergeable. Each update is a self-contained statement that's safe to apply multiple times or out of order (e.g. a heartbeat time range).
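The last bullet is the load-bearing one. A heartbeat time range is the canonical example of an idempotently mergeable statement: merging is (min of starts, max of ends), so re-applying or reordering updates changes nothing. A minimal sketch (the function name is illustrative):

```python
def merge_range(current, update):
    # Commutative, idempotent merge: the union-spanning interval.
    if current is None:
        return update
    return (min(current[0], update[0]), max(current[1], update[1]))


updates = [(10, 20), (5, 15), (18, 30)]

forward = None
for u in updates:                        # applied in order
    forward = merge_range(forward, u)

backward = None
for u in reversed(updates + updates):    # duplicated and reversed
    backward = merge_range(backward, u)

assert forward == backward == (5, 30)    # same result either way
```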

When it's the wrong choice

  • High update rate or large per-node state. Broadcast becomes a bandwidth tax.
  • Strict consistency needed. Kafka delivery to consumers is at-least-once, so nodes may briefly see divergent state; use consensus when atomic transitions matter.
  • No shared Kafka infrastructure. The "simple" framing depends on operational Kafka already being cheap.

Canonical instance

systems/netflix-flowcollector — each of 30 c7i.2xlarge nodes sees some subset of the ~5M flows/sec and publishes learned (ip, workload_id, t_start, t_end) time ranges to a broadcast Kafka topic. Every node subscribes and merges peer time ranges into its local per-IP ownership map. Verbatim acknowledgement from the post: "Although more efficient broadcasting implementations exist, the Kafka-based approach is simple and has worked well for us."
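The merge described above can be sketched as a per-IP ownership map keyed by workload, where each broadcast record widens the known time range. This is a schematic of the kind of merge the post describes, not Netflix's implementation; all names and values are illustrative.

```python
from collections import defaultdict

# ip -> workload_id -> (t_start, t_end)
ownership = defaultdict(dict)


def merge_record(ip, workload_id, t_start, t_end):
    # Fold one broadcast record into the local per-IP ownership map.
    prev = ownership[ip].get(workload_id)
    if prev is None:
        ownership[ip][workload_id] = (t_start, t_end)
    else:
        ownership[ip][workload_id] = (min(prev[0], t_start), max(prev[1], t_end))


# Records from different nodes, including overlapping observations:
merge_record("10.0.0.5", "svc-a", 100, 200)
merge_record("10.0.0.5", "svc-a", 150, 260)   # a peer extends the known range
merge_record("10.0.0.9", "svc-b", 120, 180)

assert ownership["10.0.0.5"]["svc-a"] == (100, 260)
```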

Trade-offs explicitly chosen

  • Bandwidth over latency. Every node sees every other node's writes; far cheaper to operate than replicated consensus, but total broadcast traffic grows as O(N²) with cluster size (N writers × N readers).
  • Simplicity over optimality. "We implemented a broadcasting mechanism using Kafka, where each node publishes learned time ranges to all other nodes." Netflix explicitly names the tradeoff.
  • Kafka's ordering guarantees (per-partition) are sufficient when merges are order-independent.
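The third bullet can be demonstrated directly: when the merge is commutative and idempotent, two nodes that see the same records in different interleavings, and with at-least-once duplicates, still converge. A hypothetical sketch:

```python
import random

records = [("ip1", 100, 200), ("ip1", 150, 260), ("ip2", 50, 80)]


def apply_all(seq):
    # Build a view by folding records in whatever order they arrive.
    view = {}
    for ip, start, end in seq:
        cur = view.get(ip)
        view[ip] = (start, end) if cur is None else (min(cur[0], start), max(cur[1], end))
    return view


node_a = apply_all(records)

shuffled = records + [records[0]]   # a duplicate, as with at-least-once delivery
random.shuffle(shuffled)            # a different interleaving
node_b = apply_all(shuffled)

assert node_a == node_b             # order and duplication don't matter
```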

Alternatives

  • Custom gossip (SWIM, Corrosion, etc.). Purpose-built, lower overhead, but an operational burden.
  • Raft / Paxos log. Over-strong for convergent state.
  • Shared write-through cache (Redis, etc.). Centralises the state in one store, introducing a single-point dependency and a network hop on every read.
