CONCEPT Cited by 1 source
Range sharding¶
Range sharding routes each row to a shard according to which pre-defined range its shard-key value falls into. The router holds a small table of (shard_id, lower_bound, upper_bound) tuples and dispatches queries accordingly. It is one of the four production sharding strategies enumerated by Ben Dicken (Source: sources/2026-04-21-planetscale-database-sharding), alongside hash, lookup, and custom.
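The router table described above can be sketched in a few lines. This is a minimal illustration, not any particular system's router: the shard names, the integer key space 0..99, and the half-open `[lower_bound, upper_bound)` convention are all assumptions made for the example.

```python
from bisect import bisect_right
from typing import NamedTuple


class ShardRange(NamedTuple):
    shard_id: str
    lower_bound: int  # inclusive (assumed convention)
    upper_bound: int  # exclusive (assumed convention)


# Hypothetical routing table: four shards over integer keys 0..99,
# kept sorted by lower_bound so it can be binary-searched.
ROUTING_TABLE = [
    ShardRange("shard-1", 0, 25),
    ShardRange("shard-2", 25, 50),
    ShardRange("shard-3", 50, 75),
    ShardRange("shard-4", 75, 100),
]

_LOWER_BOUNDS = [r.lower_bound for r in ROUTING_TABLE]


def route(key: int) -> str:
    """Return the shard whose [lower_bound, upper_bound) range contains key."""
    # Find the last range whose lower bound is <= key.
    idx = bisect_right(_LOWER_BOUNDS, key) - 1
    if idx < 0 or key >= ROUTING_TABLE[idx].upper_bound:
        raise KeyError(f"no shard owns key {key}")
    return ROUTING_TABLE[idx].shard_id
```

Because the table is tiny and sorted, a lookup is one binary search; the cost of range sharding lies in choosing the bounds, not in evaluating them.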
When it works¶
Range sharding works when the value distribution across the shard-key column is known and stable — the ranges can be drawn so each shard receives a roughly equal share of both data and traffic, and the distribution doesn't drift such that the ranges become unbalanced over time.
- Geographic sharding (country_code) when tenant sizes are known.
- Tenant sharding with large tenants given dedicated ranges.
- Time-range sharding on historical-only tables where current writes land in one shard by design (analytics-read + archival-write).
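When the distribution is known, the ranges can be drawn from it rather than from the key space. The sketch below picks split points at sample quantiles so each range holds roughly the same number of rows; the function names and the idea of using a representative sample are assumptions for illustration, and the sketch balances row counts only, not traffic.

```python
from bisect import bisect_right
from collections import Counter


def equal_share_bounds(sample, num_shards):
    """Pick num_shards - 1 split points so each range holds ~equal sample rows."""
    ordered = sorted(sample)
    n = len(ordered)
    return [ordered[(i * n) // num_shards] for i in range(1, num_shards)]


def shard_counts(values, bounds):
    """Count how many values land in each range defined by the split points.

    A value v goes to range i when bounds[i-1] <= v < bounds[i].
    """
    counts = Counter(bisect_right(bounds, v) for v in values)
    return [counts.get(i, 0) for i in range(len(bounds) + 1)]


# Hypothetical skewed key distribution: squares of 0..99 cluster at low values.
sample = [k * k for k in range(100)]
quantile_bounds = equal_share_bounds(sample, 4)
even_width_bounds = [2500, 5000, 7500]  # naive equal-width split of 0..10000

print(shard_counts(sample, quantile_bounds))
print(shard_counts(sample, even_width_bounds))
```

Quantile-derived bounds stay balanced only as long as new data follows the sampled distribution, which is the "known and stable" condition stated above.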
When it fails — three canonical hotspots¶
Dicken's primer walks three obvious range-sharding choices on a retailer toy schema, each producing uneven load (Source: sources/2026-04-21-planetscale-database-sharding):
- Monotonically increasing IDs: "The first 25 inserts all go to the first shard, leading to one hot shard … and three other cool shards. If we continue inserting, the same problem arises for all the other shards." The active write frontier pins to a single shard while the rest stay cold. This is a canonical instance of concepts/hot-key; Figma names the same phenomenon on Snowflake-style timestamp-prefixed IDs (sources/2026-04-21-figma-how-figmas-databases-team-lived-to-tell-the-scale).
- Alphabetical name ranges: "None of our users have names in the v-z range, leading to a wasted shard. Such a sharding solution only works well if our users have names that are perfectly evenly distributed across the alphabet. This is rarely true in practice." Real-world string distributions are non-uniform (Zipfian over first letter), so ranges drawn under a uniform assumption end up skewed.
- Age ranges: "The vast majority of our users are between 25-74 years of age. Two of our shards are hot with lots of traffic while the other two are quite cold." The distribution also drifts: today's working-age shard is tomorrow's retiree shard. Range sharding on a column whose distribution shifts over time carries an ongoing rebalancing burden.
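The monotonic-ID hotspot is easy to reproduce. The sketch below assumes four shards that each own 25 consecutive IDs (1-25, 26-50, 51-75, 76-100); that layout is inferred from the "first 25 inserts" wording and is an assumption, as are the function names.

```python
from collections import Counter

# Assumed layout: four shards, each owning 25 consecutive IDs.
RANGE_WIDTH = 25
NUM_SHARDS = 4


def shard_for_id(row_id: int) -> int:
    """Range-shard an ID in 1..100 to a shard index 0..3."""
    if not 1 <= row_id <= RANGE_WIDTH * NUM_SHARDS:
        raise ValueError(f"ID {row_id} is outside every configured range")
    return (row_id - 1) // RANGE_WIDTH


def write_load(first_id: int, count: int) -> list:
    """Count writes per shard for `count` sequential inserts starting at first_id."""
    hits = Counter(shard_for_id(i) for i in range(first_id, first_id + count))
    return [hits.get(s, 0) for s in range(NUM_SHARDS)]


print(write_load(1, 25))   # the first batch of inserts
print(write_load(26, 25))  # the next batch
```

Each batch of sequential inserts lands entirely on one shard, so the data eventually evens out while the write load at any moment does not.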
Trade-off vs hash sharding¶
| Property | Range sharding | Hash sharding |
|---|---|---|
| Range scans on shard key | Efficient — sequential keys on one shard | Inefficient — sequential keys scatter |
| Even distribution without prior knowledge | No — requires knowing the distribution | Yes — property of the hash |
| Handles monotonic IDs | Bad — active frontier hotspot | Good — hash smears across shards |
| Handles skewed value distributions | Bad — ranges match the skew | Good — hash flattens skew |
| Rebalancing when distribution drifts | Required, ongoing | Not required for the routing itself |
Range sharding wins on range-scan queries; hash wins on distribution robustness. The choice depends on whether shard-key range scans are dominant (favour range) or rare (favour hash).
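The range-scan row of the table can be made concrete by counting how many shards a scan over consecutive keys has to visit under each scheme. This is a sketch under assumptions: integer keys 0..99, four equal ranges, and SHA-256 modulo the shard count as a stand-in hash (not the hash of any specific database).

```python
import hashlib

NUM_SHARDS = 4
RANGE_WIDTH = 25  # assumed: keys 0..99 split into four equal ranges


def range_shard(key: int) -> int:
    """Shard index under range sharding: consecutive keys share a shard."""
    return key // RANGE_WIDTH


def hash_shard(key: int) -> int:
    """Shard index under hash sharding, using a stand-in hash function."""
    digest = hashlib.sha256(str(key).encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


def shards_touched(lo: int, hi: int, shard_fn) -> set:
    """Set of shards a scan over keys lo..hi (inclusive) must visit."""
    return {shard_fn(k) for k in range(lo, hi + 1)}


print(shards_touched(30, 45, range_shard))
print(shards_touched(30, 45, hash_shard))
```

Under range sharding the scan over keys 30..45 stays on a single shard; under hashing the same keys scatter, so the query becomes a fan-out across shards.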
Vitess context¶
Vitess supports range sharding via its Vindex framework: each shard owns a range of keyspace_id values, and a range-based Primary Vindex produces contiguous keyspace_id outputs. Most production Vitess deployments use a hash Primary Vindex (the shard-key values are hashed to produce the keyspace_id), giving hash-sharding semantics on top of Vitess's range-addressed shard fabric. Range sharding at the logical level (on unhashed shard-key values) is an explicit configuration choice, not the default.
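The two layers can be illustrated with a toy model. This is a sketch of the idea only, not Vitess code: the shard names follow the Vitess hex-range naming style, but the four-way split, the SHA-256 stand-in hash, and all function names are assumptions, and Vitess's real hash Vindex uses a different function.

```python
import hashlib

# Assumed layout: four shards, each owning a contiguous slice of the
# keyspace_id space, identified here by the first byte of the id.
SHARD_RANGES = [
    ("-40", 0x00, 0x40),
    ("40-80", 0x40, 0x80),
    ("80-c0", 0x80, 0xC0),
    ("c0-", 0xC0, 0x100),
]


def shard_for_keyspace_id(ksid: bytes) -> str:
    """Range-address a keyspace_id to the shard that owns it."""
    first = ksid[0]
    for name, lo, hi in SHARD_RANGES:
        if lo <= first < hi:
            return name
    raise KeyError(ksid.hex())


def hashed_keyspace_id(user_id: int) -> bytes:
    """Stand-in for a hash Primary Vindex (NOT Vitess's actual hash)."""
    return hashlib.sha256(user_id.to_bytes(8, "big")).digest()[:8]


def ordered_keyspace_id(user_id: int) -> bytes:
    """Stand-in for an order-preserving mapping: nearby ids stay nearby."""
    return user_id.to_bytes(8, "big")


ids = range(1, 1001)
print({shard_for_keyspace_id(ordered_keyspace_id(i)) for i in ids})
print({shard_for_keyspace_id(hashed_keyspace_id(i)) for i in ids})
```

In this model the shard layer is range-addressed in both cases; whether the workload behaves like range sharding or hash sharding depends only on how the shard-key value is mapped to a keyspace_id.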
Seen in¶
- sources/2026-04-21-planetscale-database-sharding — canonical pedagogical treatment; three hotspot instances (monotonic IDs, alphabetical names, age ranges) each walking why an obvious range choice fails.
- sources/2026-04-21-figma-how-figmas-databases-team-lived-to-tell-the-scale — Figma names the Snowflake-ID hotspot as the reason they hash the shard key rather than range-shard on it.
- — Holly Guevara (PlanetScale, 2024-07-08) canonicalises the operator-tuned resharding loop that keeps range sharding viable against skew: "if we started with this method for mapping the ranges, but then quickly noticed that shard 1 was growing much faster than the others, we may decide to break shard 1 up into multiple shards with a reshard operation. We can also make the range for the last shard larger since those letters happen less frequently." Complementary framing to Dicken's "range sharding only works when the value distribution is known and stable" — Guevara names the widen-cold-range + split-hot-range corrective loop as the operational posture that makes range sharding workable when the distribution isn't known in advance.
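The split-hot-range and widen-cold-range moves that Guevara describes can be sketched as pure edits to the router table. The table layout, shard names, and helper names below are hypothetical, and the sketch covers routing metadata only; a real reshard also has to move the rows.

```python
def split_range(table, shard_id, split_at, new_ids):
    """Replace one (shard_id, lower, upper) entry with two that meet at split_at."""
    out = []
    for sid, lo, hi in table:
        if sid == shard_id:
            if not lo < split_at < hi:
                raise ValueError("split point must fall strictly inside the range")
            out.append((new_ids[0], lo, split_at))
            out.append((new_ids[1], split_at, hi))
        else:
            out.append((sid, lo, hi))
    return out


def move_boundary(table, index, new_bound):
    """Shift the boundary between table[index] and table[index + 1]."""
    left, right = table[index], table[index + 1]
    if not left[1] < new_bound < right[2]:
        raise ValueError("new boundary must stay inside the two adjacent ranges")
    out = list(table)
    out[index] = (left[0], left[1], new_bound)
    out[index + 1] = (right[0], new_bound, right[2])
    return out


# Hypothetical first-letter ranges, upper bound exclusive;
# "{" is the ASCII character after "z" and serves as the end sentinel.
table = [("s1", "a", "g"), ("s2", "g", "n"), ("s3", "n", "t"), ("s4", "t", "{")]

split = split_range(table, "s1", "d", ("s1a", "s1b"))  # break up the hot shard
widened = move_boundary(table, 2, "r")                 # give the last shard more letters
```

Both helpers return a new table rather than mutating the old one, so a router could swap tables atomically once the underlying data has been moved.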