PlanetScale — Database sharding¶
Summary¶
Ben Dicken (PlanetScale, 2025-01-09) publishes an interactive primer on database sharding — a pedagogical post but architecturally substantive: it canonicalises the four production sharding strategies (range / hash / lookup / custom), the three shard-key selection criteria (cardinality, volatility, query-pattern alignment), the proxy-tier pattern (VTGate-style router between application and shards), the cross-shard-query cost as a scale cap, the proxy latency tax (~1 ms when co-located; Slack reports 2 ms average query latency on a Vitess cluster as the upper-bound reference), the per-shard replica set for durability + HA, and the shard-parallel backup property (4 TB at 100 MB/s: 11 h unsharded → 2.7 h across four shards). The load-bearing framing:
"In a sharded database, we will have multiple separate database servers, each with a portion of the total data … [with application-side sharding] the code running on the application server has to be aware of all of the shards, know which rows are stored where, and keep a connection open to each. This is not a huge problem with only two shards, but it becomes complex when there are hundreds of them. Storing this logic in the application code can quickly become messy and difficult to maintain. A better option is to have the app servers connect to an intermediary server that we call a proxy."
That paragraph is the canonical wiki disclosure of why sharding requires a proxy tier once the shard count grows past a handful: the proxy owns the fan-out complexity and the application sees a single logical table. Vitess's VTGate is named as the canonical instance.
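The proxy-tier move can be sketched in a few lines. This is a hypothetical Python illustration, not VTGate code: the `Shard` and `Router` classes, the dict-as-table stand-in, and the SHA-256 routing choice are all invented for the example — the point is only that the caller holds one connection and never names a shard.

```python
import hashlib

# Hypothetical sketch of the proxy-tier pattern (not VTGate internals):
# the application holds ONE connection, to the Router; the Router owns
# the shard map, the routing decision, and the per-shard connections.

class Shard:
    """Stand-in for one database server; a dict plays the role of a table."""
    def __init__(self, name):
        self.name = name
        self.rows = {}

class Router:
    """VTGate-style proxy: shards appear to the caller as one logical table."""
    def __init__(self, shards):
        self.shards = shards

    def _shard_for(self, key):
        # Hash routing, as in the post's hash-sharding strategy.
        digest = hashlib.sha256(str(key).encode()).hexdigest()
        return self.shards[int(digest, 16) % len(self.shards)]

    def insert(self, user_id, row):
        self._shard_for(user_id).rows[user_id] = row

    def get(self, user_id):
        return self._shard_for(user_id).rows.get(user_id)

router = Router([Shard(f"shard-{i}") for i in range(4)])
router.insert(42, {"name": "joseph"})
assert router.get(42) == {"name": "joseph"}  # caller never names a shard
```

Application-side sharding would inline `_shard_for` into every caller; the proxy tier centralises it, which is exactly the maintenance argument the quoted paragraph makes.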
The post's four-strategy enumeration is the canonical wiki taxonomy of how rows map to shards:
- Range sharding — pre-defined value ranges per shard. Three worked examples show why every obvious range choice on the retailer-toy schema (user_id / name / age) produces uneven distribution: monotonic IDs pin all new inserts to the first shard, alphabetical name ranges leave v-z empty, age ranges concentrate on the 25–74 band. Range sharding only works when the value distribution is known and stable.
- Hash sharding — route on a cryptographic hash of the shard-key value; each shard owns a hash range. Dicken's framing: "similar inputs can produce very different outputs … joseph might hash to 45, josephine to 28. Similar names, completely different hashes." The pay-off is even distribution without knowing the value distribution in advance. The trade-off: range scans on the shard key become scatter-gather because sequential keys hash to different shards.
- Lookup sharding — a developer-maintained mapping table that the proxy reads on each routable query. Strictly more flexible than range or hash (arbitrary key-to-shard mapping), at the cost of an extra lookup hop + keeping the lookup table consistent.
- Custom sharding function — user-defined code that takes column(s) and returns shard(s). Vitess supports this as an escape hatch for domain-specific routing.
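The four strategies can each be written as a routing function. A hedged sketch under invented assumptions — the shard names, range bounds, and lookup-table entries are made up, and real Vitess Vindexes implement these differently:

```python
import bisect
import hashlib

SHARDS = ["s0", "s1", "s2", "s3"]

# 1. Range sharding: pre-defined upper bounds per shard (bounds invented;
#    works only when the value distribution is known and stable).
RANGE_BOUNDS = [25, 50, 75, float("inf")]

def route_range(user_id):
    return SHARDS[bisect.bisect_left(RANGE_BOUNDS, user_id)]

# 2. Hash sharding: hash the key, map the hash onto a shard. Even spread,
#    but sequential keys land on different shards (scatter-gather scans).
def route_hash(key):
    digest = hashlib.sha256(str(key).encode()).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

# 3. Lookup sharding: a developer-maintained mapping table the proxy
#    consults on each routable query (entries invented).
LOOKUP_TABLE = {"acme-corp": "s2", "globex": "s0"}

def route_lookup(tenant):
    return LOOKUP_TABLE[tenant]

# 4. Custom sharding function: arbitrary user code from column(s) to
#    shard(s), e.g. pinning one country's rows, hashing everything else.
def route_custom(user_id, country):
    return "s3" if country == "DE" else route_hash(user_id)
```

Note how the taxonomy nests: `route_custom` composes `route_hash`, mirroring the later caveat that real systems layer strategies rather than picking exactly one.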
The shard-key selection criteria are canonicalised in three steps: (1) high cardinality — a unique-per-row column like user_id distributes better than a high-skew column like name; (2) low volatility — immutable keys (like user_id) never force row migration, while mutable keys (like step_count) force the row to move between shards on every update; (3) query-pattern alignment — shard so the dominant query (e.g. "all steps for user X" on a steps table) routes to a single shard; sharding the steps table on step_count instead of user_id forces that query to be cross-shard.
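The volatility criterion is easy to demonstrate numerically. A toy sketch: `shard_for` is a stand-in routing function (a modulo, not a real hash) and the row values are invented; the point is only that mutating the shard key changes the routing result.

```python
# Toy routing function: modulo stands in for a real hash (illustration only).
def shard_for(value, n_shards=4):
    return value % n_shards

row = {"user_id": 7, "step_count": 100}

# Volatile shard key: a routine UPDATE changes the routing result, so the
# row must physically migrate between shards.
before = shard_for(row["step_count"])   # 100 % 4 -> shard 0
row["step_count"] += 3                  # ordinary update
after = shard_for(row["step_count"])    # 103 % 4 -> shard 3
assert before != after                  # the row's home shard changed

# Immutable shard key: user_id never changes, so the row never moves.
assert shard_for(row["user_id"]) == shard_for(7)
```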
The proxy latency tax is named plainly — "Clearly, it takes longer!" — and the co-location mitigation is named with a number: "If the proxy and shards all live in the same data center, the added latency can be brought down to 1ms or less." The production reference is Slack: "Slack runs massive sharded database cluster with Vitess, and reports an average query latency of only 2ms."
The per-shard replica set is the durability + HA primitive — each shard has a primary + N replicas; Vitess's reparenting orchestration replaces failed primaries from the replica set. The shard-parallel backup calculation is made concrete: 4 TB at 100 MB/s/server is 11 h unsharded; 4 × 1 TB shards backing up in parallel finishes in ~2.7 h.
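The backup arithmetic checks out directly (decimal units assumed, as the post's numbers imply):

```python
# Backup throughput is N x per-server bandwidth, so wall-clock backup time
# divides by the shard count. Decimal units: 1 TB = 10^12 B, 1 MB = 10^6 B.
TOTAL_BYTES = 4e12        # 4 TB of data
PER_SERVER_BPS = 100e6    # assumed 100 MB/s backup bandwidth per server

def backup_hours(total_bytes, n_shards, per_server_bps=PER_SERVER_BPS):
    # Each of n shards holds total/n and streams its backup simultaneously.
    return (total_bytes / n_shards) / per_server_bps / 3600

unsharded = backup_hours(TOTAL_BYTES, 1)  # ~11.1 h, the post's "~11 hours"
sharded = backup_hours(TOTAL_BYTES, 4)    # ~2.8 h, the post's "~2.7 hours"
assert unsharded / sharded == 4           # time scales down with shard count
```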
Key takeaways¶
- The proxy tier is the structural fix for application-side shard awareness. "With [application-side sharding], the code running on the application server has to be aware of all of the shards, know which rows are stored where, and keep a connection open to each … A better option is to have the app servers connect to an intermediary server that we call a proxy." The proxy owns the fan-out complexity and the application sees a single logical table. Canonical wiki naming of the database-proxy-tier pattern with Vitess's VTGate as the named instance: "In Vitess, these proxy servers are known as Vitess Gates, or VTGates for short."
- Four production sharding strategies, not one. Range / hash / lookup / custom. Range works when value distributions are known and stable; hash is the general-purpose default; lookup trades an extra hop for arbitrary key-to-shard mapping; custom is the escape hatch. Vitess supports all four. "Software like Vitess also supports lookup sharding and even custom sharding functions."
- Naive range sharding on monotonic IDs is a canonical hotspot antipattern. "The first 25 inserts all go to the first shard, leading to one hot shard … and three other cool shards. If we continue inserting, the same problem arises for all the other shards. Using naive range-based sharding with IDs is generally a bad idea if our IDs are monotonically increasing." Same phenomenon regardless of fleet size — the hotspot is structural, not capacity-solvable. Canonical wiki cross-ref to concepts/hot-key and to Figma's same observation on Snowflake-style timestamp-prefixed IDs (sources/2026-04-21-figma-how-figmas-databases-team-lived-to-tell-the-scale).
- Hash sharding's cost is scatter-gather on shard-key range scans. Even distribution is the pay-off; "range scans on the shard key become inefficient — sequential keys hash to different shards" is the trade-off. Sister framing to the Figma canonical statement. Accept this when shard-key range scans are rare (the common OLTP case); reject it when they're dominant (analytics workloads, time-series).
- Shard-key criteria: cardinality + volatility + query-pattern alignment. Cardinality: a unique-per-row column (user_id) distributes better than a skewed one (name). Volatility: "Always take time to consider the volatility of a column before selecting it as a shard key." A mutable column (step_count) forces row migration on every update; an immutable column (user_id) never does. Query-pattern alignment: "A better solution would be to shard based on user_id. That way, all entries for a given user will live on the same shard, which will allow us to avoid this cross-shard query." Canonical three-axis framing; strict superset of the generic "pick something high-cardinality" advice.
- Cross-shard queries are a scale cap, not a latency cost. "Cross-shard queries are bad for system performance, and should be avoided whenever possible. When multiple shards need to fulfill a single query, this adds excessive network and CPU overhead to the database cluster." Dicken's framing sits alongside Figma's scatter-gather framing: both name the property that a query touching every shard loads the cluster like an unsharded database would, erasing the horizontal-sharding pay-off for that query. The shard key is chosen specifically to make the dominant query single-shard (patterns/shard-key-aligned-with-query-pattern).
- Proxy tier costs one extra network hop — ~1 ms when co-located. "If the proxy and shards all live in the same data center, the added latency can be brought down to 1ms or less. For the vast majority of applications, adding 1ms is worth the scalability achieved with the sharded architecture." Production upper-bound reference: "Slack runs massive sharded database cluster with Vitess, and reports an average query latency of only 2ms." Canonical wiki framing of the proxy-tier latency tax as a cheap structural cost when the topology is right, expensive when the proxy is cross-region. The mitigation is co-location.
- The proxy tier itself is a scaling axis. "The proxy server hit the capacity for simultaneous queries it could process, and had to queue up other inserts. This added latency would be unacceptable for a production database system. To get around this, we can add more proxy servers." Canonical wiki framing: proxies aren't a singleton — they're horizontally scaled alongside shards, and peak-traffic capacity planning has to account for both axes.
- Per-shard replica sets give durability + HA independently at each shard. "Each shard has a single primary and then two additional replicas per-shard for data durability." Replicas give: (1) durability against single-server loss, (2) ability to fail over primary to a replica instead of "hours or even days of outage," (3) read scale-out. Vitess's orchestration "can automatically detect server failures and quickly replace downed primaries." Canonical wiki naming of the per-shard replica-set pattern — durability + HA are per-shard concerns, not cluster-wide; each shard's replica set is independent.
- Sharding parallelises backups — backup time scales down with shard count. "[4 TB unsharded] backing up this database will take (4TB / 100MBps * 60s * 60m) = ~11 hours. That's a long time! Alternatively, what if this 4TB was spread out across 4 shards storing 1TB each? Each individual server can capture a backup simultaneously at 100MBps, allowing the cluster to back up the data at 400MBps, taking only ~2.7 hours." Canonical wiki numerical framing of the shard-parallel-backup property: backup throughput is N × per-server bandwidth where N is the shard count (pair-wise with the snapshot-plus-catchup-replication framing for restore). This is one of the "surprising benefits" Dicken links to the companion Three surprising benefits of sharding a MySQL database + faster backups with sharding posts (not ingested here — this post is the self-contained canonical treatment of the property).
- Production instances named. Uber (Schemaless rewrite), Shopify (horizontally scaling the Rails backend of Shop App with Vitess), Slack (scaling datastores at Slack with Vitess), Cash App ("process millions of queries per second at peak hours" per the linked case study). "Your favorite companies like Uber, Shopify, Slack, and Cash App all use sharding with Vitess and MySQL to scale their massive databases." Canonical wiki one-line citation that Vitess + MySQL is the industry-standard production sharding substrate, not a PlanetScale-specific choice.
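The monotonic-ID hotspot from the range-sharding takeaway above can be reproduced with the post's own toy setup (range bounds of 25 / 50 / 75 / 100 are the illustrative values; everything here is a sketch, not production routing code):

```python
import bisect

# The post's toy setup: four shards owning ID ranges 1-25, 26-50, 51-75,
# 76-100, and a monotonically increasing (auto-increment) user_id.
BOUNDS = [25, 50, 75, 100]   # upper bound per shard
inserts_per_shard = [0, 0, 0, 0]

for new_id in range(1, 26):  # the first 25 inserts
    inserts_per_shard[bisect.bisect_left(BOUNDS, new_id)] += 1

# Every insert lands on shard 0: one hot shard, three cool shards — and
# once shard 0's range fills, the hotspot just moves to shard 1.
assert inserts_per_shard == [25, 0, 0, 0]
```

Adding capacity doesn't help: the skew follows from the key order, which is why the takeaway calls the hotspot structural rather than capacity-solvable.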
Systems / concepts / patterns surfaced¶
- New systems: (none — systems/vitess, systems/mysql, systems/planetscale, systems/neki all canonical already).
- New concepts (8):
  - concepts/range-sharding — pre-defined value ranges per shard; works when value distribution is known and stable.
  - concepts/hash-sharding — route on hash(shard_key); even distribution at the cost of shard-key range scans.
  - concepts/lookup-sharding — developer-maintained mapping table consulted by the proxy; arbitrary key-to-shard flexibility at the cost of an extra lookup.
  - concepts/shard-key-cardinality — unique-per-row columns distribute better than skewed ones; cardinality as the first criterion for shard-key selection.
  - concepts/shard-key-volatility — mutable shard-key columns force row migration on update; pick immutable keys.
  - concepts/cross-shard-query — any query requiring ≥2 shards; a scale cap, not just a latency cost (sibling of concepts/scatter-gather-query — Dicken's naming, Figma's naming, same phenomenon).
  - concepts/proxy-tier-latency-tax — the extra hop costs ~1 ms co-located, more cross-region; canonical co-location mitigation.
  - concepts/shard-parallel-backup — backup throughput scales as N × per-server bandwidth.
- New patterns (3):
  - patterns/database-proxy-tier — the VTGate-style architectural move: proxy between application and shards owns routing, connection management, and makes shards appear as a single logical table. The structural fix for application-side shard awareness.
  - patterns/shard-key-aligned-with-query-pattern — choose the shard key so the dominant query predicate routes to a single shard; the discipline that prevents cross-shard queries from dominating traffic.
  - patterns/per-shard-replica-set — each shard has its own primary + replicas; durability + HA are per-shard concerns, composed over all shards rather than defined at cluster scope.
- Extended:
  - systems/vitess — first canonical wiki pedagogical treatment of VTGate as the proxy-tier instance (companion to the internals-altitude disclosures across sources/2026-04-21-planetscale-connection-pooling-in-vitess + consistent-lookup-vindex + throttler series).
  - systems/mysql — canonical wiki citation of Vitess + MySQL as the industry-standard production sharding substrate (Uber / Shopify / Slack / Cash App).
  - systems/planetscale — new Seen-in for the interactive-primer + Neki-waitlist call-out.
  - concepts/shard-key — extended with the three-criterion framing (cardinality / volatility / query-pattern alignment) as a strict superset of the existing Figma-derived cardinality-only framing.
  - concepts/horizontal-sharding — extended with the proxy-tier sub-architecture framing + the four-strategy taxonomy.
  - concepts/hot-key — new Seen-in for Dicken's monotonic-ID-range-shard hotspot instance; matches the Figma framing.
  - concepts/scatter-gather-query — new Seen-in + sister-framing with concepts/cross-shard-query (Dicken names the general phenomenon; Figma names the fan-out-then-aggregate shape).
  - concepts/consistent-hashing — new Seen-in for the hash-sharding derivation.
Operational numbers¶
- ~1 ms proxy latency tax when proxy + shards are co-located in the same data center.
- 2 ms — Slack's reported average query latency on their Vitess sharded cluster (production reference; cites https://slack.engineering/scaling-datastores-at-slack-with-vitess/).
- 100 MB/s — the assumed per-server backup bandwidth in the worked example.
- 4 TB / 100 MB/s ≈ 11 h — unsharded-database full-backup duration.
- 4 × 1 TB shards / 400 MB/s ≈ 2.7 h — shard-parallel-backup duration for the same total size.
- Proxy + shards capacity-planning signal: "the proxy server hit the capacity for simultaneous queries it could process, and had to queue up other inserts" — peak-traffic sizing must include proxy count, not just shard count.
- Named production users: Uber, Shopify, Slack, Cash App — all on Vitess + MySQL sharding.
Caveats¶
- Pedagogical voice, not engineering deep-dive. The post is built around interactive diagrams (clickable cluster visualisations, animation-speed controls, insert/select demo buttons) with a textbook narrative; it does not disclose Vitess-internals mechanics (how VTGate actually planner-rewrites a query, how consistent hashing is implemented in Vitess vs lookup Vindexes, how the proxy's own connection pool interacts with the per-shard primaries, how reparenting works under load). Those are covered in separate internals-altitude ingests on the wiki (sources/2026-04-21-planetscale-connection-pooling-in-vitess, sources/2026-04-21-planetscale-achieving-data-consistency-with-the-consistent-lookup-vindex, consensus parts 4–7).
- Single worked numerical example (the backup calculation). The ~1 ms proxy latency and 2 ms Slack latency are "typical" values, not benchmarked; the 100 MB/s per-server backup bandwidth is assumed, not measured. No distributions, no p99, no scaling-curve data.
- "Cryptographic hash" is imprecise. Dicken's simplified framing is "we generate a cryptographic hash of this value … an algorithm that always produces a hash between 0 and 100" — real Vitess hash Vindexes don't use cryptographic hashes for routing; they use fast non-cryptographic mixers (keyspace_id =
binary(8)per concepts/keyspace-id). Cryptographic-hash framing is pedagogical simplification, not engineering accuracy; future readers should consult concepts/vindex + concepts/keyspace-id for the production mechanism. - Four-strategy taxonomy is clean but not exhaustive. Real sharding systems layer strategies (hash-routing with lookup-Vindex overrides; range-sharding with hash-sub-sharding within a range; geographic sharding for data-residency regulation — concepts/geographic-sharding). The Dicken enumeration is the pedagogical baseline; the systems/vitess Vindex family is the production superset.
- "1 ms proxy latency" assumes a healthy co-located topology. Cross-region proxy (e.g. app in one AZ, proxy-plus-shards in another, with AZ-level network partitions) can easily push the tax to 5–20 ms; the post does not acknowledge this bound. The concepts/proxy-tier-latency-tax wiki page carries the caveat.
- Per-shard replica set example (1 primary + 2 replicas) is a Vitess default, not a universal rule. Replica count per shard is a per-tenant configuration in Vitess (VDiff requires ≥1 idle replica for verification); 1 + 2 is a reasonable default, not an architectural constant.
- Resharding is named but not described. "As the amount of data grows, we can add more shards to support it. This is a process known as resharding." The mechanics (how existing rows migrate, what snapshot+catchup does, cut-over semantics, rollback) are canonicalised elsewhere on the wiki (sources/2026-02-16-planetscale-zero-downtime-migrations-at-petabyte-scale, patterns/routing-rule-swap-cutover) — this post only names the primitive.
- No production incident content. The post mentions that "some organizations spend months transitioning from an unsharded to sharded architecture" without a specific case study. No failure mode walkthrough, no split-brain during proxy failure, no reshard-mid-outage narrative.
- PlanetScale product CTA at the tail + Neki waitlist. "Looking for sharded Postgres? We're building Neki — sharded Postgres by the team behind Vitess." The marketing posture is mild and appended to a substantive architectural narrative; this is the standard Dicken-post shape on PlanetScale's blog (see the PlanetScale company page for the Dicken-is-architecture-dense skip-rule exception).
Source¶
- Original: https://planetscale.com/blog/database-sharding
- Raw markdown: raw/planetscale/2026-04-21-database-sharding-c252ed8b.md
Related¶
- systems/vitess — the production substrate; VTGate named as the canonical proxy-tier instance.
- systems/mysql — the storage engine behind every named production user (Uber / Shopify / Slack / Cash App).
- systems/planetscale — the vendor voice; Ben Dicken pedagogy-author venue.
- systems/neki — sharded Postgres in waitlist; the post's closing pointer.
- concepts/horizontal-sharding — the parent concept this post is the interactive primer for.
- concepts/shard-key — extended with the three-criterion framing.
- concepts/range-sharding / concepts/hash-sharding / concepts/lookup-sharding — the three named strategies with canonical pages.
- concepts/shard-key-cardinality / concepts/shard-key-volatility — the shard-key-selection criteria.
- concepts/cross-shard-query / concepts/scatter-gather-query — Dicken's general framing + Figma's fan-out-shape framing of the same phenomenon.
- concepts/proxy-tier-latency-tax — the ~1 ms co-located cost + Slack's 2 ms upper-bound reference.
- concepts/shard-parallel-backup — the N × per-server-bandwidth property.
- concepts/hot-key — Dicken's monotonic-ID instance sits alongside Figma's Snowflake-ID instance.
- concepts/consistent-hashing — the algorithmic primitive under hash sharding.
- patterns/database-proxy-tier — the VTGate-style architectural move canonicalised.
- patterns/shard-key-aligned-with-query-pattern — the discipline that prevents cross-shard queries from dominating.
- patterns/per-shard-replica-set — the per-shard durability + HA primitive.
- sources/2026-04-21-figma-how-figmas-databases-team-lived-to-tell-the-scale — the Figma shard-key-selection canon; same phenomenon framed from the receiving-team side.
- sources/2026-04-21-planetscale-connection-pooling-in-vitess — Vitess-internals altitude on the pool primitive that complements the routing primitive disclosed here.
- sources/2026-04-21-planetscale-achieving-data-consistency-with-the-consistent-lookup-vindex — Vitess-internals altitude on the Vindex routing machinery that the "hash sharding" strategy actually implements.
- sources/2026-02-16-planetscale-zero-downtime-migrations-at-petabyte-scale — the resharding mechanics this post names but doesn't describe.
- companies/planetscale — company page.