Skip to content

SYSTEM Cited by 1 source

Apache ZooKeeper

Apache ZooKeeper is a long-running distributed-coordination service providing strongly-consistent hierarchical key-value storage (the zNode tree), ephemeral nodes tied to client session lifetime, and a watch mechanism that notifies a subscribing client when a named zNode changes. Historically the default coordinator for Apache Kafka clusters; being retired in Kafka 4.0 in favour of KRaft.

Stub page — expand on future ZooKeeper-internals sources (ZAB, election, sessions, watch scalability, operations). The canonical wiki entry point today is its Kafka-deployment role as described in the 2024-05-09 Kafka 101 post.

Role in classical Kafka (pre-3.3)

From Stanislav Kozlovski's Kafka-101 explainer (sources/2024-05-09-highscalability-kafka-101):

  • Controller election: every broker raced to register the /controller zNode on cluster start-up; the first to succeed became the active Controller. On Controller death, the next broker to register the zNode became the new Controller.
  • Cluster metadata storage: "Kafka used to persist all sorts of metadata in ZooKeeper, including the alive set of brokers, the topic names and their partition count, as well as the partition assignments."
  • Change notification via watches: "Kafka also used to heavily leverage ZooKeeper's watch mechanism, which would notify a subscriber whenever a certain zNode changed."

Retirement from Kafka

Kozlovski: "For the last few years, Kafka has actively been moving away from ZooKeeper towards its own consensus mechanism called KRaft." Production-ready KRaft shipped in Kafka 3.3 (October 2022); Kafka 4.0 (expected Q3 2024) removes ZooKeeper entirely. See systems/kraft and concepts/kraft-metadata-log.

Why replace:

  • Operational burden of running a second consensus system alongside Kafka.
  • Watch + zNode abstractions are different from Kafka's own log-replication abstractions — KRaft collapses cluster metadata into an ordinary Kafka log.
  • Metadata scale (partition count × broker count × replica count) grew past what the watch-driven pull model handled gracefully.

Seen in

Last updated · 319 distilled / 1,201 read