

Ghost-node ejection

Definition

Ghost-node ejection is the automatic removal of stale node-membership references that linger in a distributed cluster's internal state after a node has left the cluster but before its metadata references have been cleaned up. The "ghost" is a node whose presence has ended in reality but persists as a dangling entry in the cluster's view of membership, quorum, or replica placement.

Without ghost-node ejection, the cluster's internal state drifts from ground truth — members that aren't there still count toward quorums, still appear in replica sets, still consume a slot in ops-console membership lists — and subsequent cluster operations (leader elections, replica placement decisions, rebalances) consult a model of the cluster that has stale nodes in it.

Canonical Redpanda framing

From the sources/2026-03-31-redpanda-261-delivers-the-industrys-first-adaptable-streaming-engine|26.1 launch post:

"Automatic ghost node ejection: Redpanda now automatically cleans up after 'ghost' nodes that have left the cluster, keeping your cluster state pristine."

One sentence of product disclosure. First wiki canonicalisation of the phenomenon.

The failure mode

Typical ghost-node creation paths:

  1. Ungraceful shutdown — a broker process crashes or is terminated without a clean decommission. Cluster metadata still lists the node as a member.
  2. Long network partition — a node is unreachable long enough to be considered "departed" by some observers, yet cluster metadata still references it.
  3. Hardware replacement — a failed node is replaced with a new hostname / identity; the old identity persists in state until manually reaped.
  4. Decommission step failure — a multi-step decommission workflow aborts mid-flight, leaving the node marked partially-removed.
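All four paths end in the same state: the cluster's metadata view and ground truth diverge. A toy model makes that drift concrete (illustrative Python only; `ClusterState` and its fields are invented for this sketch, not Redpanda's data structures):

```python
from dataclasses import dataclass, field

@dataclass
class ClusterState:
    """Toy membership model: the cluster's *view* vs. ground truth."""
    members: set = field(default_factory=set)    # what metadata says
    reachable: set = field(default_factory=set)  # what is actually there

    def ghosts(self) -> set:
        # A ghost is any node still listed in metadata but gone in reality.
        return self.members - self.reachable

state = ClusterState(members={"n1", "n2", "n3"},
                     reachable={"n1", "n2", "n3"})
state.reachable.discard("n3")   # n3 crashes without a clean decommission
print(state.ghosts())           # {'n3'} lingers in cluster metadata
```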

Consequences of a lingering ghost node:

  • Raft quorum math is off: a 5-node cluster where 2 are ghost-nodes reads as a 5-node cluster for quorum calculations but only has 3 participating — quorum can be lost on a single additional failure.
  • Replica placement decisions consult a model that includes the ghost; partitions may be "assigned" to the ghost and never replicated.
  • Operator ergonomics — dashboards, monitoring alerts, and rpk cluster status show noise that doesn't correspond to real nodes; real incidents are masked.
  • Cleanup operational burden — operators run manual "remove-node" commands periodically to maintain hygiene.
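The quorum arithmetic in the first bullet can be checked directly. A toy calculation (standard majority-quorum math, not Redpanda internals):

```python
def quorum(n: int) -> int:
    """Majority quorum for an n-member Raft group."""
    return n // 2 + 1

metadata_size = 5               # cluster metadata still counts the ghosts
ghosts = 2
live = metadata_size - ghosts   # only 3 nodes can actually vote
needed = quorum(metadata_size)  # 3 votes required, computed from metadata size

print(live >= needed)           # True: quorum holds with zero margin
print(live - 1 >= needed)       # False: one more failure loses quorum
```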

Automatic vs manual cleanup

Pre-26.1, Redpanda operators had tools to manually eject ghost nodes (rpk cluster decommission-node, rpk cluster health inspection). The 26.1 change automates the detection and cleanup step: the cluster itself recognises that a node has left and prunes references without operator intervention.

This is the cluster-membership altitude analogue of concepts/explicit-teardown-on-completion at the process altitude and patterns/bad-host-auto-drain at the fleet altitude — a reliability primitive that observes end-of-life events and cleans up after them instead of relying on an external reaper.
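The cleanup half of that primitive can be pictured as a pruning pass over cluster metadata. This is a hypothetical shape only — the launch post discloses no mechanism, and `eject_ghosts` with its data model is invented for illustration:

```python
def eject_ghosts(members: set, replica_sets: dict, ghosts: set):
    """Prune every dangling reference to departed nodes.
    Hypothetical sketch; names and structures are not Redpanda's."""
    live = members - ghosts
    pruned = {partition: [n for n in replicas if n not in ghosts]
              for partition, replicas in replica_sets.items()}
    return live, pruned

members, placement = eject_ghosts(
    members={"n1", "n2", "n3"},
    replica_sets={"topic/0": ["n1", "n3"], "topic/1": ["n2", "n3"]},
    ghosts={"n3"},
)
print(sorted(members))  # ['n1', 'n2']
print(placement)        # {'topic/0': ['n1'], 'topic/1': ['n2']}
```

A real implementation would also have to trigger re-replication for the under-replicated partitions left behind, which is where the rebalance semantics in the table below come in.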

Distinguishing from node decommission

| Axis | Decommission (planned) | Ghost-node ejection (unplanned) |
| --- | --- | --- |
| Trigger | Operator runs decommission-node | Node departure observed by cluster |
| Graceful | Yes — node finishes in-flight work | No — node already gone |
| Rebalance semantics | Partitions moved off before remove | Partitions re-assigned after detection |
| Operator action required | Yes — initiated by operator | No — fully automatic |

Decommission is the graceful happy-path; ghost-node ejection is the automatic catch-all for the unhappy paths.

Mechanism gaps (from the source)

The 26.1 launch post is one sentence of PR framing. Undisclosed:

  • Detection mechanism — heartbeat-timeout-based? Gossip-convergence-based? Raft-membership-change-based?
  • Timeout thresholds — how long must a node be unreachable before it's declared a ghost? Tunable?
  • Interaction with Raft membership changes — does ghost-node ejection go through the Raft config-change protocol, or is it a metadata-only update?
  • False-positive protection — what prevents a flapping network from marking a healthy node as a ghost?
  • Interaction with partitioned minorities — in a split-brain, which side considers the other a ghost?
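Absent disclosure, one plausible answer to the detection and false-positive questions is a heartbeat timeout combined with a consecutive-strike requirement so that a flapping network cannot mark a healthy node as a ghost on a single missed check. Everything below — the class, the thresholds, the strike count — is an assumption for illustration, not Redpanda's mechanism:

```python
class GhostDetector:
    """Heartbeat-timeout ghost detection with flap protection.
    Purely illustrative: Redpanda's actual mechanism is undisclosed."""

    def __init__(self, timeout_s: float = 300.0, confirmations: int = 3):
        self.timeout_s = timeout_s          # silence longer than this is suspect
        self.confirmations = confirmations  # consecutive strikes before ejecting
        self.last_seen: dict = {}
        self.strikes: dict = {}

    def heartbeat(self, node: str, now: float) -> None:
        self.last_seen[node] = now
        self.strikes[node] = 0              # any heartbeat resets flap protection

    def check(self, node: str, now: float) -> bool:
        """True once the node has been silent past the timeout for
        `confirmations` consecutive checks."""
        if now - self.last_seen.get(node, now) > self.timeout_s:
            self.strikes[node] = self.strikes.get(node, 0) + 1
        else:
            self.strikes[node] = 0
        return self.strikes[node] >= self.confirmations

d = GhostDetector(timeout_s=10.0, confirmations=2)
d.heartbeat("n3", 0.0)
print(d.check("n3", 15.0))  # False: first strike only
print(d.check("n3", 20.0))  # True: second strike, declared a ghost
```

Note this sketch still leaves the split-brain question open: each side of a partition would run its own detector, which is exactly why a production mechanism would likely need to route ejection through a quorum-backed config change.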
