Skip to content

CONCEPT Cited by 1 source

Anti-entropy (gossip)

Anti-entropy is the family of gossip algorithms in which nodes compare full replicas (or their compact summaries) and patch the differences. Coined by Demers et al. 1987 in the original epidemic-algorithms paper. The naming is literal — "introduced to reduce the entropy between replicas of a stateful service such as the database" (Source: sources/2023-07-16-highscalability-gossip-protocol-explained).

Mechanism

On each cycle, a node picks a random peer and the two exchange either:

  • Whole-dataset — cheapest to implement, expensive in bandwidth.
  • Merkle-tree digests — recursively compare subtree hashes, transfer only leaves that differ. Bandwidth cost collapses from O(dataset) to O(diff). Canonical in Cassandra read-repair.
  • Checksums or recent-update lists — lightweight approximations when trees are overkill.

The node with the newer value for each key wins (per (generation, version) logical timestamp).

Contrast with rumor-mongering

Anti-entropy concepts/rumor-mongering
Cadence Slower cycles Faster cycles
Bandwidth/cycle Higher Lower
Termination Unbounded (runs forever) Message aged out after k rounds
Guarantee Eventually covers every key-value High probability, but a rumor can still miss nodes

The trade-off: anti-entropy is the safety net that catches anything rumor-mongering missed. Production gossip stacks (Cassandra, Dynamo) usually run both — rumor-mongering for fresh writes, anti-entropy for long-tail repair.

Bandwidth cost and why Merkle trees matter

Naive anti-entropy transfers the whole dataset each round — "unnecessary bandwidth usage" at any nontrivial data size. The standard optimisation: each peer computes a Merkle tree over its keyspace; the two sides exchange root hashes, descend recursively into mismatching subtrees, and transfer only the differing leaves. "Techniques such as checksum, recent update list, and Merkle tree can be used to identify the differences between nodes to avoid transmission of the entire dataset and reduce network bandwidth usage" (Source: sources/2023-07-16-highscalability-gossip-protocol-explained).

Seen in

Last updated · 319 distilled / 1,201 read