SYSTEM Cited by 1 source

Cruise Control

Cruise Control is LinkedIn-originated open-source software for automatically rebalancing Apache Kafka clusters. It reads per-broker metrics from a Kafka topic, builds an in-memory model of the cluster, runs that model through a greedy heuristic bin-packing algorithm to decide a better partition-replica assignment, and applies the result incrementally via Kafka's low-level reassignment API.

Upstream: github.com/linkedin/cruise-control.
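
The pipeline described above (build a model from per-broker load, improve it greedily, emit moves) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Broker` class, the single scalar "load" per replica, and the hottest-to-coldest heuristic are hypothetical and are not Cruise Control's actual model or algorithm.

```python
from dataclasses import dataclass, field


@dataclass
class Broker:
    broker_id: int
    # (topic, partition) -> load contributed by that replica (hypothetical unit)
    replicas: dict = field(default_factory=dict)

    @property
    def load(self):
        return sum(self.replicas.values())


def greedy_rebalance(brokers, max_moves=100):
    """Repeatedly move one replica from the hottest broker to the coldest.

    Returns the proposed moves as (topic_partition, source_id, dest_id).
    The in-memory model (the Broker objects) is updated as moves are chosen.
    """
    moves = []
    for _ in range(max_moves):
        hot = max(brokers, key=lambda b: b.load)
        cold = min(brokers, key=lambda b: b.load)
        gap = hot.load - cold.load
        # A move narrows the gap only if the replica's load is smaller than it,
        # and a broker may not hold two replicas of the same partition.
        candidates = [
            (tp, load)
            for tp, load in hot.replicas.items()
            if 0 < load < gap and tp not in cold.replicas
        ]
        if not candidates:
            break
        tp, load = max(candidates, key=lambda c: c[1])  # largest helpful replica
        del hot.replicas[tp]
        cold.replicas[tp] = load
        moves.append((tp, hot.broker_id, cold.broker_id))
    return moves


if __name__ == "__main__":
    cluster = [
        Broker(1, {("t", 0): 5, ("t", 1): 3, ("t", 2): 2}),
        Broker(2),
        Broker(3, {("t", 3): 1}),
    ]
    for move in greedy_rebalance(cluster):
        print(move)
```

The result is a list of individual moves rather than a whole new assignment, which mirrors the idea of applying a computed model incrementally. A greedy pass like this is not guaranteed to find the best packing; it only stops when no single move from the hottest broker helps.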

Why rebalancing is hard

Kozlovski: "Essentially the NP-hard Bin Packing problem at heart, the community has developed a few tools and even a fully-fledged component to handle this." See concepts/bin-packing.

At non-trivial cluster scale, varying client workloads produce hot spots and inefficient resource distribution over time. Kafka's reassignment primitive lets an operator execute a replica move, but it does not compute which moves to make: it exposes the move but not the plan. Cruise Control owns the plan.
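
To make "the move" concrete, here is a small sketch that builds the JSON document accepted by Kafka's stock `kafka-reassign-partitions` tool. It illustrates what a reassignment request looks like; it is not a claim about how Cruise Control itself submits reassignments. The helper name `reassignment_plan`, the topic name, and the broker ids are hypothetical.

```python
import json


def reassignment_plan(moves):
    """Build a reassignment document from {(topic, partition): [broker ids]}.

    The first broker id in each list is the preferred leader.
    """
    return {
        "version": 1,
        "partitions": [
            {"topic": topic, "partition": partition, "replicas": list(replicas)}
            for (topic, partition), replicas in sorted(moves.items())
        ],
    }


if __name__ == "__main__":
    # Hypothetical example: place partition 0 of topic "orders" on brokers 2, 3, 4.
    plan = reassignment_plan({("orders", 0): [2, 3, 4]})
    print(json.dumps(plan, indent=2))
```

Deciding which target replica lists to put in this document, for every partition in the cluster, is the planning problem that the tool leaves to the operator.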

Architecture (Kafka-101 framing)

"Cruise Control, originally also developed at LinkedIn, is an open-source component which reads all brokers' metrics from a Kafka topic, builds an in-memory model of the cluster and runs that model through a greedy heuristic bin-packing algorithm to optimize the model via reassigning partitions. Once it has computed a more efficient model, it begins incrementally applying it to the cluster by leveraging the low-level Kafka reassignment API." (Source: sources/2024-05-09-highscalability-kafka-101)

Goals — prioritised-constraints framing

Cruise Control exposes "a configurable set of rebalancing logic consisting of multiple Goals, each of which is ran with its associated priority and balances on its associated resource." Goal classes cover replica-count balance, disk usage, network in/out, CPU, partition-leader count per broker, rack awareness, and others. Goals are optimised in priority order: while working on a given goal, only moves that keep every higher-priority goal satisfied are accepted, so a cluster state is acceptable only if it does not regress any higher-priority goal.
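
The priority ordering can be sketched with two goals: a higher-priority rack-awareness check that can veto moves, and a lower-priority replica-count balancer that must respect it. This is a minimal illustration under assumed data structures (`assignment` maps a partition name to its broker ids, `racks` maps a broker id to a rack label); the function names are hypothetical and do not correspond to Cruise Control's Goal interface.

```python
from collections import Counter


def rack_aware_accepts(assignment, racks, move):
    """Higher-priority goal: a partition's replicas must sit on distinct racks."""
    partition, src, dst = move
    after = [dst if b == src else b for b in assignment[partition]]
    return len({racks[b] for b in after}) == len(after)


def balance_replica_count(assignment, racks, accepts_higher):
    """Lower-priority goal: even out replica counts, subject to higher goals."""
    applied = []
    while True:
        counts = Counter({b: 0 for b in racks})
        for replicas in assignment.values():
            counts.update(replicas)
        src = max(counts, key=counts.get)
        dst = min(counts, key=counts.get)
        if counts[src] - counts[dst] <= 1:
            break  # already balanced to within one replica
        move = next(
            (
                (p, src, dst)
                for p, replicas in sorted(assignment.items())
                if src in replicas
                and dst not in replicas
                and all(ok(assignment, racks, (p, src, dst)) for ok in accepts_higher)
            ),
            None,
        )
        if move is None:
            break  # every candidate is vetoed by a higher-priority goal
        p = move[0]
        assignment[p] = [dst if b == src else b for b in assignment[p]]
        applied.append(move)
    return applied


if __name__ == "__main__":
    racks = {1: "a", 2: "a", 3: "b"}
    assignment = {"p0": [1, 3], "p1": [1, 3], "p2": [1, 3]}
    print(balance_replica_count(assignment, racks, [rack_aware_accepts]))
```

In the example, the balancer would like to shift replicas from broker 3 to broker 2 as well, but doing so would put two replicas of a partition on rack "a", so the rack-awareness goal rejects those moves and the lower-priority goal settles for a less even count.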

Continuous operation

"Cruise Control continuously monitors the cluster's metrics and automatically triggers a rebalance once it notices the metrics going outside of its defined acceptable thresholds."

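A threshold check of the kind described above might look like the following sketch. The band definition (a symmetric margin around the cluster mean) and the names `brokers_outside_band` and `should_rebalance` are assumptions for illustration; the source does not specify how Cruise Control defines its acceptable thresholds.

```python
def brokers_outside_band(utilisation, margin=0.10):
    """Return ids of brokers whose utilisation is outside mean * (1 +/- margin).

    `utilisation` maps broker id -> utilisation of one resource (e.g. 0.0-1.0).
    """
    mean = sum(utilisation.values()) / len(utilisation)
    low, high = mean * (1 - margin), mean * (1 + margin)
    return sorted(b for b, u in utilisation.items() if u < low or u > high)


def should_rebalance(utilisation, margin=0.10):
    """True when at least one broker has drifted outside the acceptable band."""
    return bool(brokers_outside_band(utilisation, margin))


if __name__ == "__main__":
    print(should_rebalance({1: 0.50, 2: 0.52, 3: 0.48}))  # all within 10% of the mean
    print(brokers_outside_band({1: 0.80, 2: 0.40, 3: 0.30}))
```

A monitor loop would evaluate a check like this on each fresh metrics sample and start a rebalance only when it returns true.
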
Add/remove broker API

Cruise Control also exposes add broker / remove broker as first-class operations. Even with Tiered Storage, Kafka brokers remain stateful (hot-tier log segments, index files), so both adding and removing a broker require moving replicas; Cruise Control owns the motion planning.
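
As an illustration of why removing a broker is a planning problem, the sketch below drains one broker by sending each of its replicas to the remaining broker that currently holds the fewest replicas and does not already host that partition. The function `plan_broker_removal` and the least-replicas heuristic are assumptions for this example, not Cruise Control's actual remove-broker logic.

```python
def plan_broker_removal(assignment, brokers, removed):
    """Plan moves that take every replica off `removed`.

    `assignment` maps partition name -> list of broker ids; it is not modified.
    Returns a list of (partition, source_id, dest_id).
    """
    remaining = [b for b in brokers if b != removed]
    counts = {b: 0 for b in remaining}
    for replicas in assignment.values():
        for b in replicas:
            if b in counts:
                counts[b] += 1

    moves = []
    for partition, replicas in sorted(assignment.items()):
        if removed not in replicas:
            continue
        # A broker may not hold two replicas of the same partition.
        eligible = [b for b in remaining if b not in replicas]
        if not eligible:
            raise ValueError(f"no eligible destination for {partition}")
        dst = min(eligible, key=lambda b: (counts[b], b))
        counts[dst] += 1
        moves.append((partition, removed, dst))
    return moves


if __name__ == "__main__":
    assignment = {"p0": [1, 2], "p1": [1, 3], "p2": [2, 4]}
    print(plan_broker_removal(assignment, [1, 2, 3, 4], removed=1))
```

Adding a broker is the mirror image: the new broker starts empty, and a planner has to choose which existing replicas to move onto it.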

Seen in

  • sources/2024-05-09-highscalability-kafka-101: canonical wiki description of Cruise Control as the open-source bin-packer for Kafka rebalancing, including the Goals framing, the continuous-monitor trigger model, and the add/remove-broker API.