PATTERN Cited by 1 source

Virtual nodes for load balancing

Definition

Virtual nodes (vnodes) assign each physical node multiple positions on the hash ring by hashing its identifier through distinct hash functions. Instead of one node owning one contiguous arc, each physical node owns many small arcs scattered around the ring (Source: sources/2023-02-22-highscalability-consistent-hashing-algorithm).
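A minimal sketch of the idea (the class name, the `#i` suffixing scheme, and the MD5 choice are illustrative, not from the source): each physical node is hashed at many ring positions by suffixing its identifier with a vnode index, and a key lookup walks clockwise to the nearest vnode.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a position on a 2^32 ring (MD5 is an arbitrary choice)."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % (2**32)


class VNodeRing:
    """Consistent-hash ring where each physical node owns `vnodes` positions."""

    def __init__(self, nodes, vnodes=100):
        self.ring = {}  # ring position -> physical node
        for node in nodes:
            for i in range(vnodes):
                # Distinct positions come from hashing "node#i", one per vnode,
                # so one physical node owns many small scattered arcs.
                self.ring[_hash(f"{node}#{i}")] = node
        self.positions = sorted(self.ring)

    def lookup(self, key: str) -> str:
        """Walk clockwise from the key's position to the first vnode's owner."""
        pos = _hash(key)
        idx = bisect.bisect_right(self.positions, pos) % len(self.positions)
        return self.ring[self.positions[idx]]
```

Lookups stay O(log(N·k)) via binary search over the sorted positions; only the number of entries grows.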

Why plain consistent hashing isn't enough

Raw consistent hashing places each node at exactly one ring position. The arc length a node owns is random — driven by where its identifier happened to hash. With few nodes, variance is high: one node can end up owning a much larger arc and absorb a proportionally larger share of keys. Source framing:

"There is a chance that nodes are not uniformly distributed on the consistent hash ring. The nodes that receive a huge amount of traffic become hotspots resulting in cascading failure of the nodes."

Virtual nodes turn the distribution from "one sample per node" into "k samples per node, averaged": by the law of large numbers, the variance of a node's share drops as 1/k, i.e. its standard deviation shrinks as 1/√k.
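A quick simulation (illustrative, not from the source) makes the averaging effect visible: the max/mean load ratio across physical nodes tightens as the vnode count k grows.

```python
import bisect
import hashlib
from collections import Counter


def _hash(s: str) -> int:
    # MD5 is an arbitrary illustrative choice of hash function.
    return int(hashlib.md5(s.encode()).hexdigest(), 16) % (2**32)


def load_spread(n_nodes: int, k: int, n_keys: int = 20000) -> float:
    """Build a ring with k vnodes per node, hash n_keys keys onto it,
    and return max/mean key count across physical nodes (1.0 = perfect)."""
    ring = {_hash(f"node{n}#{i}"): n for n in range(n_nodes) for i in range(k)}
    positions = sorted(ring)
    counts = Counter()
    for j in range(n_keys):
        idx = bisect.bisect_right(positions, _hash(f"key{j}")) % len(positions)
        counts[ring[positions[idx]]] += 1
    return max(counts.values()) / (n_keys / n_nodes)
```

With a single position per node (k = 1) the worst node typically absorbs a multiple of its fair share; with k in the hundreds the ratio approaches 1.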

Capacity-proportional allocation for heterogeneity

Because each physical node's ring contribution is the sum of its vnodes, you can give a bigger node more vnodes and it will proportionally own more keys. Source:

"The number of positions for a node is decided by the heterogeneity of the node. In other words, the nodes with a higher capacity are assigned more positions on the hash ring."

Drawbacks (from the source)

  • Hot keys still concentrate — vnodes fix non-uniform distribution, not non-uniform popularity. patterns/bounded-load-consistent-hashing is the complementary mitigation for hot keys.
  • Capacity planning is trickier — per-vnode load is harder to reason about than per-node load.
  • Memory and operational complexity rise — the BST over ring positions now has N × k entries to maintain.
  • Replication logic is harder — replicas must land on distinct physical nodes, so the ring walker has to skip vnodes that belong to the same physical host.
  • Failure amplification — one physical-node outage removes all of its vnodes from the ring simultaneously; each vnode's clockwise neighbour absorbs its keys, and those neighbours might all be the same small set of hosts.
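The replica-placement complication above can be sketched as a clockwise walk that skips vnodes belonging to hosts already in the replica set (ring construction and names are illustrative, not from the source):

```python
import bisect
import hashlib


def _hash(s: str) -> int:
    # MD5 is an arbitrary illustrative choice of hash function.
    return int(hashlib.md5(s.encode()).hexdigest(), 16) % (2**32)


def build_ring(nodes, vnodes=64):
    """Return (position -> physical node) plus the sorted position list."""
    ring = {_hash(f"{n}#{i}"): n for n in nodes for i in range(vnodes)}
    return ring, sorted(ring)


def replicas(ring, positions, key, n):
    """Walk clockwise from the key, collecting the first n *distinct*
    physical nodes; vnodes of an already-chosen host are skipped, so
    replicas never land on the same physical machine."""
    idx = bisect.bisect_right(positions, _hash(key))
    chosen = []
    for step in range(len(positions)):
        node = ring[positions[(idx + step) % len(positions)]]
        if node not in chosen:
            chosen.append(node)
            if len(chosen) == n:
                break
    return chosen
```

Without the skip, consecutive vnodes on the ring can belong to the same host and a naive walk would place two replicas on one machine.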

Typical vnode multipliers

The source doesn't quote specific numbers. Published deployments span a wide range:

  • Dynamo / Cassandra / Riak — dozens to hundreds of vnodes per physical node (Cassandra default: 256).
  • Memcached Ketama — typically 40 vnodes per server ("replica" count in libketama terminology).

Too few → variance persists; too many → BST bloat and rebalance thrash.
