SYSTEM Cited by 3 sources
Dicer (Databricks auto-sharder)¶
Dicer is Databricks' open-sourced auto-sharder: a control plane that continuously and asynchronously updates a service's shard assignment in response to pod health, load, termination notices, and other environmental signals. It sits behind "every major Databricks product" and is explicitly positioned in the lineage of Google's systems/slicer, Microsoft's systems/centrifuge, and Meta's systems/shard-manager. It was open-sourced in January 2026 at https://github.com/databricks/dicer.
Model¶
- Key — application-level id (user id, tenant id, chat-room id, …).
- SliceKey — hash of the Key; the uniform space Dicer operates on.
- Slice — contiguous range of SliceKeys.
- Resource — a pod (or equivalent compute unit).
- Assignment — partition of the full SliceKey space into Slices, each mapped to one or more Resources. The Assigner mutates Assignments via splits, merges, replications, dereplications, and moves — always minimal adjustments, not full reshuffles.
- Target — a sharded application served by Dicer. The Assigner is multi-tenant and region-scoped; one Assigner serves many Targets within a region.
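The vocabulary above can be made concrete with a minimal sketch. It is not Dicer's actual API: the 64-bit key space, the SHA-256 hash, and the names `slice_key`, `Slice`, and `Assignment.slice_for` are all assumptions chosen for illustration. It shows only how a Key hashes to a SliceKey and how a sorted list of contiguous Slices resolves to one or more Resources.

```python
import bisect
import hashlib
from dataclasses import dataclass

SLICE_KEY_SPACE = 2**64  # assumed size of the hashed space, not taken from Dicer


def slice_key(key: str) -> int:
    """Hash an application Key into the uniform SliceKey space (hypothetical hash choice)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


@dataclass(frozen=True)
class Slice:
    start: int                  # inclusive lower bound; the slice ends where the next one starts
    resources: tuple[str, ...]  # one or more pods serving this contiguous range


class Assignment:
    """A partition of the full SliceKey space into contiguous Slices."""

    def __init__(self, slices: list[Slice]) -> None:
        ordered = sorted(slices, key=lambda s: s.start)
        if not ordered or ordered[0].start != 0:
            raise ValueError("slices must cover the space starting at 0")
        self._slices = ordered
        self._starts = [s.start for s in ordered]

    def slice_for(self, key: str) -> Slice:
        # Pick the right-most slice whose start is <= the hashed key.
        index = bisect.bisect_right(self._starts, slice_key(key)) - 1
        return self._slices[index]


# Two slices: the lower half on one pod, the upper half replicated on two pods.
assignment = Assignment([
    Slice(0, ("pod-a",)),
    Slice(SLICE_KEY_SPACE // 2, ("pod-b", "pod-c")),
])
owner = assignment.slice_for("tenant-42")
```

Because each Slice is identified by its start boundary alone, a split, merge, or move changes only the entries it touches, which matches the "minimal adjustments" framing above.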
Components¶
- Assigner — the controller service. Runs the sharding algorithm; consumes health + per-key-load signals; publishes Assignments.
- Slicelet (server-side library) — embedded in application pods. Caches the current Assignment, watches the Assigner for updates, and notifies the app via a listener API. Also records per-key load locally and reports summaries asynchronously. Both paths are off the request critical path.
- Clerk (client-side library) — embedded in clients. Caches the Assignment so that Clerk.lookup(key) → pod is a local call with no RPC.
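The caching behaviour described for the Clerk can be sketched as follows. This is an illustration under stated assumptions, not the real library: the `Clerk` class here, its `apply_update` method, the generation counter used to discard stale updates, and the CRC32 stand-in hash are all hypothetical. The point is only that `lookup` reads local state and never performs an RPC, while updates arrive separately from the request path.

```python
import bisect
import threading
import zlib


def slice_key(key: str) -> int:
    """Stand-in hash into a 32-bit SliceKey space (an assumption for this sketch)."""
    return zlib.crc32(key.encode("utf-8"))


class Clerk:
    """Client-side cache of the latest assignment seen so far."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._generation = -1          # hypothetical monotonically increasing version
        self._starts: list[int] = []   # sorted slice start boundaries
        self._pods: list[str] = []     # pod owning the slice at the same index

    def apply_update(self, generation: int, slices: list[tuple[int, str]]) -> bool:
        """Install a newer assignment; ignore anything older than what is cached."""
        ordered = sorted(slices)
        if not ordered or ordered[0][0] != 0:
            raise ValueError("assignment must cover the space starting at 0")
        with self._lock:
            if generation <= self._generation:
                return False
            self._generation = generation
            self._starts = [start for start, _ in ordered]
            self._pods = [pod for _, pod in ordered]
            return True

    def lookup(self, key: str) -> str:
        """Resolve a key to a pod from the local cache only."""
        with self._lock:
            if not self._starts:
                raise LookupError("no assignment received yet")
            index = bisect.bisect_right(self._starts, slice_key(key)) - 1
            return self._pods[index]


clerk = Clerk()
clerk.apply_update(1, [(0, "pod-a"), (2**31, "pod-b")])
clerk.apply_update(0, [(0, "pod-z")])  # stale update, ignored
pod = clerk.lookup("chat-room-7")
```

In this sketch a client may briefly route on an older assignment until the next update is applied.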
Consistency model¶
Assignments observed by Slicelets and Clerks are eventually consistent. The team frames this as a deliberate availability-and-recovery-speed choice over strong key-ownership — see concepts/eventual-consistency. systems/slicer and systems/centrifuge chose stronger lease-based ownership; Dicer's paper explicitly says "we do plan to support stronger guarantees in the future."
Primitive capabilities¶
- Dynamic rebalancing — slices moved off overloaded or draining pods before failure.
- Hot-key isolation + replication — a single hot key gets its own one-key slice, and that slice is assigned to multiple pods to split load. See patterns/shard-replication-for-hot-keys and concepts/hot-key.
- State transfer — migrate per-slice application state between pods during resharding (Softstore uses this to keep a ~85% cache hit rate across rolling restarts, versus a ~30% drop without it).
- Graceful restart / autoscale — assignment reacts to termination notices before the pod is gone, avoiding split-brain and traffic loss.
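Hot-key isolation plus replication, as listed above, amounts to a split followed by a replication. The sketch below shows one way that could look on a plain list of slices. The `Slice` shape (half-open integer ranges) and the function `isolate_hot_key` are hypothetical and are not taken from Dicer's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slice:
    start: int             # inclusive
    end: int               # exclusive
    pods: tuple[str, ...]  # pods serving this range


def isolate_hot_key(slices: list[Slice], hot: int, replicas: tuple[str, ...]) -> list[Slice]:
    """Split the slice containing `hot` so the hot SliceKey gets its own
    one-key slice, then assign that slice to several pods."""
    result: list[Slice] = []
    for s in slices:
        if not (s.start <= hot < s.end):
            result.append(s)  # untouched: only the affected slice changes
            continue
        if s.start < hot:
            result.append(Slice(s.start, hot, s.pods))    # left remainder keeps its pods
        result.append(Slice(hot, hot + 1, replicas))      # one-key slice, replicated
        if hot + 1 < s.end:
            result.append(Slice(hot + 1, s.end, s.pods))  # right remainder keeps its pods
    return result


before = [Slice(0, 100, ("pod-a",)), Slice(100, 200, ("pod-b",))]
after = isolate_hot_key(before, 150, ("pod-b", "pod-c"))
```

Only the slice that contained the hot key is rewritten, and every other slice passes through unchanged.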
Canonical use cases¶
(From the open-source announcement.)
- In-memory and GPU serving — KV stores with sub-ms local reads; LLM per-session KV cache affinity; LoRA-adapter placement on constrained GPUs.
- Control and scheduling systems — cluster managers, query orchestration engines. Local state + multi-tenant routing.
- Remote caches — systems/softstore.
- Work partitioning / background work — non-overlapping keyspace ownership for GC / cleanup.
- Batching & aggregation on write paths — route related records to the same pod so the pod can batch in memory.
- Soft leader selection — see concepts/soft-leader-election.
- Rendezvous / coordination — chat rooms, multi-client session coordination.
Production case studies¶
- systems/unity-catalog — previously stateless, hit DB-read wall. Remote cache rejected (needed incremental, snapshot-consistent updates against gigabyte-scale customer catalogs). With Dicer: sharded in-memory cache, 90–95% hit rate, drastic DB-call reduction.
- SQL Query Orchestration Engine — previously in-memory + static sharding. Manual resharding was toilsome and rolling restarts caused availability dips. Post-Dicer: zero-downtime restarts/scale, chronic CPU throttling resolved by dynamic load balancing.
- systems/softstore — distributed KV cache built on Dicer. It uses state transfer to ride through rolling restarts (99.9% of planned restarts), keeping a ~85% hit rate versus a ~30% drop without it.
Why it exists (motivating context)¶
Two dominant architectures at Databricks both had failure modes (Source: sources/2026-01-13-databricks-open-sourcing-dicer-auto-sharder):
- concepts/stateless-compute + remote cache: DB hit per request, network tax, (de)serialization CPU, and overread waste — fetching whole objects to use a fraction.
- concepts/static-sharding (e.g. consistent hashing): memory-local and fast, but suffers restart-window downtime, concepts/split-brain, and cannot solve concepts/hot-key.
Dicer's claim is that concepts/dynamic-sharding as a primitive gets the memory-locality benefit of static sharding (the second architecture above) while fixing its three structural flaws: restart-window downtime, split-brain, and hot keys.
Seen in¶
- sources/2026-05-27-databricks-reliable-llm-inference-at-scale — LLM-router-substrate use case. Dicer's third canonical Databricks production use case: powering Axon, the Databricks LLM data-plane router, at 125T+ tokens/month scale across frontier open-source (Kimi, Qwen) and proprietary (OpenAI, Gemini, Claude) models. Two structural integrations are specific to this use case. (1) Load metric switched from active-requests to model units: "We integrated model units with Dicer so that routing decisions are based on server load in model units rather than traditional request-based heuristics." This makes Dicer the wiki's canonical implementation of cost-based load balancing for LLM serving, with non-uniform request cost handled at the load-metric layer rather than at the routing-algorithm layer. (2) Stateful sessions = sticky workload-to-subset binding: "Dicer also provides stateful sessions, making request routing sticky. A workload's requests go to only a subset of servers, which improves cache hit rates (crucial for latency-sensitive workloads like coding agents) and limits blast radius." This productionises the Dicer "In-memory and GPU serving" canonical use case (LLM per-session KV cache affinity) at platform scale, serving two purposes at once (cache hit rate and blast radius). Verbatim on the cost asymmetry: "a small number of expensive long-context requests can trigger different routing and scaling decisions than many cheap short requests." See patterns/stateful-llm-session-routing for the LLM-application pattern.
- sources/2026-05-05-databricks-10-trillion-samples-a-day-scaling-beyond-traditional-monitoring — Stateful-aggregator sharding use case. Dicer's second canonical Databricks production use case: powering the Telegraf metric-aggregation tier that shields Pantheon from cardinality growth. The load-bearing property is sticky routing: metric series stay pinned to the same aggregator across redeployments so in-memory running state (counters, percentile reservoirs, histogram buckets) survives. Databricks explicitly rejected a Kafka-backed partitioning alternative ("costly at our scale and adds ingestion delay that impacts real-time usecases") in favour of this sticky-routing-with-minimal-reassignment model. Scaled to >1 GB/s aggregation throughput in the largest region across thousands of aggregation rules. See patterns/sticky-routing-for-aggregator-state for the full trade-off pattern. Canonical instance of Dicer's "Batching & aggregation on write paths" use case from the original open-source announcement (sample-routing keyed on metric series → colocates samples-for-the-same-series at the same aggregator pod so the pod can aggregate in memory).
- sources/2026-01-13-databricks-open-sourcing-dicer-auto-sharder — announcement + motivation + use-case catalogue + three case studies (Unity Catalog, SQL Query Orchestration, Softstore).
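The model-unit integration quoted in the first entry above can be illustrated with a small comparison. The source does not define how a model unit is computed, so this sketch takes per-request costs as given inputs. The `Server` class and both picker functions are hypothetical and say nothing about how Axon or Dicer implement the decision. It shows only why counting requests and summing cost can choose different servers.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    name: str
    # Cost of each in-flight request, in assumed "model units".
    in_flight_costs: list[float] = field(default_factory=list)

    @property
    def active_requests(self) -> int:
        return len(self.in_flight_costs)

    @property
    def load_units(self) -> float:
        return sum(self.in_flight_costs)


def pick_by_requests(servers: list[Server]) -> Server:
    """Request-based heuristic: fewest in-flight requests wins."""
    return min(servers, key=lambda s: s.active_requests)


def pick_by_units(servers: list[Server]) -> Server:
    """Cost-based heuristic: lowest total in-flight cost wins."""
    return min(servers, key=lambda s: s.load_units)


# Two expensive long-context requests versus ten cheap short ones.
long_context = Server("server-a", [50.0, 50.0])
short_requests = Server("server-b", [1.0] * 10)

by_requests = pick_by_requests([long_context, short_requests])
by_units = pick_by_units([long_context, short_requests])
```

With these made-up costs, the request-count heuristic sends the next request to `server-a` (2 requests in flight) even though it carries 100 units of work, while the cost-based picker chooses `server-b` (10 units).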
Related¶
- systems/databricks-axon — the LLM data-plane router built on Dicer; Dicer's third canonical Databricks production use case.
- systems/slicer / systems/centrifuge / systems/shard-manager — prior art.
- concepts/dynamic-sharding
- concepts/static-sharding
- concepts/hot-key
- concepts/split-brain
- concepts/eventual-consistency
- concepts/soft-leader-election
- concepts/control-plane-data-plane-separation — Assigner (decide) vs Slicelet/Clerk (deliver).
- patterns/shard-replication-for-hot-keys
- patterns/state-transfer-on-reshard
- systems/unity-catalog / systems/softstore — case studies.