SYSTEM Cited by 1 source

Softstore (Databricks distributed KV cache)

Softstore is Databricks' distributed remote KV cache for services that aren't themselves sharded. It is built on systems/dicer and is the canonical example of Dicer's state-transfer feature in action.

Why Softstore exists

Not every service can or should restructure into a sharded stateful model. Softstore offers a cache tier those services can call into while benefiting from Dicer's auto-sharding capabilities (load-balancing, hot-key handling, graceful restart handling) under the hood.
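From the caller's side, using such a cache tier usually amounts to a read-through lookup. The sketch below illustrates that shape only: `InMemoryKVCache`, `read_through`, and the `load` callback are hypothetical names, and the in-memory class merely stands in for a remote cache client. It is not Softstore's actual API.

```python
from typing import Callable, Dict, Optional


class InMemoryKVCache:
    """Hypothetical stand-in for a remote KV cache client (not Softstore's API)."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def get(self, key: str) -> Optional[bytes]:
        # Returns None on a cache miss.
        return self._data.get(key)

    def put(self, key: str, value: bytes) -> None:
        self._data[key] = value


def read_through(cache: InMemoryKVCache, key: str,
                 load: Callable[[str], bytes]) -> bytes:
    """Return the cached value, loading and caching it on a miss."""
    value = cache.get(key)
    if value is None:
        value = load(key)       # slow path: ask the backing store
        cache.put(key, value)   # populate the cache for later callers
    return value
```

The calling service stays unsharded and stateless; anything to do with sharding, hot keys, or restarts would live behind the cache client rather than in the caller.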

State transfer during rolling restarts

  • 99.9% of restarts in the Databricks fleet are planned rolling restarts; the full keyspace churns.
  • Without state transfer, a rolling restart drops cache hit rate by ~30% (every evicted pod throws away its cache, and its successor pod starts cold).
  • With state transfer: Dicer migrates per-slice state between pods across resharding, preserving a steady ~85% hit rate through the restart for a representative use case.

The architectural pattern is named patterns/state-transfer-on-reshard: a cache (or any per-key-state service) that reshards across restarts without re-warming.
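A toy model can make the difference concrete. The sketch below is an assumption-laden illustration, not Dicer's implementation: `Pod`, `slice_of`, `rolling_restart`, `hit_rate`, and the fixed `NUM_SLICES` are all hypothetical, slices are assigned to pods by simple modulo, and every pod is replaced in a single step rather than incrementally. It does not attempt to reproduce the hit-rate figures quoted above.

```python
from typing import Dict, List

NUM_SLICES = 8  # hypothetical: the keyspace is split into 8 slices


def slice_of(key: int) -> int:
    """Map a key to its slice (toy scheme, assumed for illustration)."""
    return key % NUM_SLICES


class Pod:
    def __init__(self, name: str) -> None:
        self.name = name
        # slice id -> cached entries for that slice
        self.slices: Dict[int, Dict[int, str]] = {}

    def lookup(self, key: int) -> bool:
        """Return True on a cache hit; on a miss, fill the cache."""
        entries = self.slices.setdefault(slice_of(key), {})
        if key in entries:
            return True
        entries[key] = f"value-{key}"
        return False


def rolling_restart(pods: List[Pod], transfer_state: bool) -> List[Pod]:
    """Replace every pod with a successor, optionally handing over slice state."""
    successors = []
    for old in pods:
        new = Pod(old.name + "-next")
        if transfer_state:
            # Move per-slice state to the successor instead of discarding it.
            new.slices, old.slices = old.slices, {}
        successors.append(new)
    return successors


def hit_rate(pods: List[Pod], keys: List[int]) -> float:
    """Route each key to the pod owning its slice and measure the hit fraction."""
    hits = sum(pods[slice_of(k) % len(pods)].lookup(k) for k in keys)
    return hits / len(keys)
```

In this model, warming two pods with a fixed key set and then calling `rolling_restart(pods, transfer_state=False)` leaves the successors cold, so the next pass over the same keys misses everywhere. With `transfer_state=True` the successors inherit the slices and the same pass hits everywhere.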

Seen in
