SYSTEM Cited by 2 sources
SWIM protocol¶
SWIM — Scalable Weakly-consistent Infection-style Process Group Membership — is a gossip-based cluster-membership + failure-detection algorithm introduced by Das, Gupta, and Motivala (Cornell, 2002). Canonical paper (PDF).
SWIM is the specific gossip variant called out by sources/2023-07-16-highscalability-gossip-protocol-explained as the one Consul (via HashiCorp's Serf / memberlist libraries) uses in production, and also the one Fly.io's Corrosion is built on.
What SWIM adds over generic gossip¶
- Indirect ping for failure detection. A node that doesn't respond to a direct
pingis not immediately suspected. Instead, the prober askskrandom peers to each send aping-req. If any of the indirect probes gets anack, the target is live (it was the prober-to-target path that was broken). This collapses false-positive dead detection dramatically over pure heartbeat-based gossip. - Gossip piggyback — SWIM carries membership updates piggybacked on the ping/ack/ping-req messages it would send anyway, so the failure-detection channel is the membership-propagation channel.
- Infection-style rumor-mongering — each update (join/leave/suspect/alive) is gossiped a fixed number of times and then retired. Keeps per-node bandwidth bounded.
- Suspicion mechanism — a state between alive and dead: a suspected node is gossiped as "suspect" for a timeout; if it or anyone else refutes the suspicion with a fresher alive message, the node returns to alive. Otherwise it's promoted to confirmed-dead. This handles the asymmetric-latency case where a node is slow to respond, not actually dead.
State machine¶
Each node's membership table tracks every other member in one of four states:
Transitions propagate via piggybacked gossip. Version numbers on each transition ensure the alive ← suspect ← confirmed-dead ordering is monotone per incarnation.
Canonical implementations¶
- Consul / Serf / memberlist (Go) — HashiCorp's SWIM library, used by Consul, Nomad, Vault, and many third-party systems. Probably the most widely-deployed SWIM implementation in production.
- Fly.io Corrosion (Rust) — SWIM membership + QUIC update reconciliation + cr-sqlite CRDT. Canonical wiki primary source.
- Weave Mesh — SWIM-style membership in the Weave Net overlay.
- Cassandra's gossip is SWIM-like but predates and deviates from the canonical SWIM paper (closer to the Dynamo-style heartbeat-plus-reconciliation model).
Tuning parameters¶
- Probe interval (T) — how often each node pings one random peer. Typical values 200 ms – 1 s.
- Indirect-probe count (
k) — number of peers to ask on indirect probe; typically 3. - Suspicion timeout — how long a node stays suspect before being declared dead; depends on cluster size (longer for larger clusters).
- Gossip messages per round — how many piggybacked updates per ping.
Known trade-offs¶
- Partitions are still invisible — if a sub-group is internally connected but severed from the rest, SWIM inside each sub-group stays happy. Inherited from gossip generally; see concepts/gossip-protocol.
- Byzantine peers — SWIM is not byzantine-tolerant. A malicious peer can mark live nodes dead, or refuse indirect probes for others. Pair with authentication/authorisation for any untrusted-peer deployment.
Seen in¶
- sources/2023-07-16-highscalability-gossip-protocol-explained — named as the gossip variant Consul uses.
- sources/2025-10-22-flyio-corrosion — primary-source deep-dive on SWIM at Fly.io scale; the gossip substrate underneath Corrosion.