CONCEPT Cited by 1 source

Init-container IP-gossip pre-migration

Definition

A discipline for upgrading a gossip-based distributed datastore on Kubernetes, where pods get new IPs on restart. The IP change and the version change are sequenced into two distinct gossip-observable events: a Kubernetes init container runs the old version on the new IP long enough for gossip to accept the IP change, and the main container then starts on the new version.

Why sequence

Distributed protocols that track peer identity by (IP, version) tuples may fail to negotiate two simultaneous changes in the identity tuple, because the peer looks entirely new rather than "same node, new IP" or "same node, new version."

The canonical example: Cassandra 3.11's gossip failed to negotiate initial communication when both the IP and version changed at once (CASSANDRA-19244).
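The reasoning can be illustrated with a toy model. This is a minimal sketch, not Cassandra's actual gossip logic: the `Identity` tuple, the `classify` function, and the sample IPs are all hypothetical names invented for illustration. It assumes a peer recognises a node as long as at least one element of the (IP, version) tuple still matches what it already knows.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    UNCHANGED = "unchanged"
    IP_CHANGE = "same node, new IP"
    VERSION_CHANGE = "same node, new version"
    UNRECOGNISED = "looks like an entirely new peer"


@dataclass(frozen=True)
class Identity:
    """Hypothetical identity tuple a peer remembers for a node."""
    ip: str
    version: str


def classify(known: Identity, announced: Identity) -> Outcome:
    """Toy rule: a peer can absorb one changed element, but not two at once."""
    same_ip = known.ip == announced.ip
    same_version = known.version == announced.version
    if same_ip and same_version:
        return Outcome.UNCHANGED
    if same_version:
        return Outcome.IP_CHANGE
    if same_ip:
        return Outcome.VERSION_CHANGE
    return Outcome.UNRECOGNISED


# Illustrative values only.
old = Identity(ip="10.0.0.5", version="3.11")
after_init = Identity(ip="10.0.0.9", version="3.11")   # old version, new IP
after_main = Identity(ip="10.0.0.9", version="4.1")    # new version, same IP

# Changing both elements at once versus one element per step.
direct = classify(old, after_main)
sequenced = [classify(old, after_init), classify(after_init, after_main)]
```

Under this assumed rule, `direct` is `Outcome.UNRECOGNISED`, while `sequenced` contains one IP change followed by one version change, each of which a peer can reconcile on its own.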

Mechanism

Pod restart → new IP assigned by k8s
Init container:
  - starts old-version (e.g. Cassandra 3.11)
  - gossip: observes IP change under known version
  - init container exits once gossip has converged on new IP
Main container:
  - starts new-version (e.g. Cassandra 4.1) on stable IP
  - gossip: observes pure version change

The init container is a cheap, ephemeral, gossip-only instance of the old version — it doesn't have to serve production traffic; it just has to make gossip happy with the new IP.
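The step "init container exits once gossip has converged on new IP" implies some convergence check. The source does not describe how that check is implemented, so the following is only a sketch under stated assumptions: `wait_for_gossip_convergence`, `node_id`, and the injected `fetch_view` callable are hypothetical, and `fetch_view(peer)` is assumed to return that peer's current mapping of node ID to IP by whatever mechanism the datastore offers.

```python
import time
from typing import Callable, Iterable, Mapping


def wait_for_gossip_convergence(
    node_id: str,
    new_ip: str,
    peers: Iterable[str],
    fetch_view: Callable[[str], Mapping[str, str]],  # hypothetical: peer -> {node_id: ip}
    timeout_s: float = 300.0,
    poll_s: float = 5.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until every peer reports new_ip for node_id, or the timeout expires."""
    peers = list(peers)
    deadline = clock() + timeout_s
    while True:
        # Converged only when all peers agree on the new address.
        if all(fetch_view(peer).get(node_id) == new_ip for peer in peers):
            return True
        if clock() >= deadline:
            return False
        sleep(poll_s)


# In-memory stand-in for a two-peer cluster (illustrative values only).
views = {
    "peer-a": {"node-1": "10.0.0.5"},
    "peer-b": {"node-1": "10.0.0.5"},
}


def fake_sleep(_seconds: float) -> None:
    """Pretend gossip reaches one more peer during each poll interval."""
    for view in views.values():
        if view["node-1"] != "10.0.0.9":
            view["node-1"] = "10.0.0.9"
            break


converged = wait_for_gossip_convergence(
    "node-1", "10.0.0.9", views, views.__getitem__, sleep=fake_sleep
)
```

In an init container, a wrapper around such a check would stop the old-version process and exit with status 0 only when the function returns `True`. Kubernetes starts the main containers only after every init container has exited successfully, so the new version never starts before peers have accepted the new IP.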

Seen in

  • sources/2026-04-07-yelp-zero-downtime-cassandra-4x-upgrade — canonical wiki Seen-in. Yelp's Cassandra 3.11 → 4.1 upgrade across > 1,000 nodes on Kubernetes. "We leveraged the Kubernetes init containers to first start the Cassandra node on a new pod with a different IP with the older 3.11 version, allowing the node to gossip with its new IP address before proceeding with the version upgrade." Presented at KubeCon 2025 ("Upgrading Cassandra on Kubernetes").

Where it generalises

  • Any gossip-based datastore on Kubernetes without static IPs: Cassandra, ScyllaDB, Elasticsearch, Consul, etcd with peer discovery by IP.
  • Any upgrade that crosses a protocol boundary where the identity tuple evolves (IP, version, cluster-id).
  • Any platform primitive that triggers re-addressing on restart — VMs that get new private IPs, container runtimes without pinned networking.