CONCEPT Cited by 1 source
Partition strategy¶
Partition strategy is the umbrella term for "how a sharded database decides which rows are stored together and on which server". Justin Gage's canonical naming (Source: sources/2026-04-21-planetscale-what-is-database-sharding-and-how-does-it-work):
"How you decide to split up your data into shards – also referred to as your partition strategy – should be a direct function of how your business runs, and where your query load is concentrated."
The phrase sits above the three/four concrete strategies on the wiki (hash, range, lookup / directory-based, plus custom) as the choice-level noun — the decision the team makes before wiring up the mechanics.
Load-bearing property: "direct function of how your business runs"¶
The strategy isn't a generic architectural preference — it's derived from the dominant query-load shape:
- B2B SaaS where every user belongs to an organization → partition by
org_id(related tables colocate; per-org queries go to one shard). - Consumer with no meaningful clustering → hash on a unique ID (Gage's worked example: Amazon's
order_id). - Single-tenant verticals with large known tenants → directory / lookup-based with operator-authored placement.
Gage's Notion worked example: "Notion manually sharded their Postgres database by simply splitting on team ID." The team-ID shard key is a direct projection of Notion's data model — every row belongs to a team; most queries are team-scoped; collocation is automatic.
Three canonical algorithms (Gage's list)¶
Gage names three partition-strategy algorithms (vs Dicken's four — Dicken adds a "custom" escape hatch):
| Algorithm | Row→shard mapping | Trade-off |
|---|---|---|
| Hash-based (key-based) | bucket(hash(col)) |
Even distribution; range-scan scatter-gather |
| Range-based | Pre-assigned value ranges | Range-scan efficient; hotspot risk |
| Directory-based (= lookup) | Operator-authored mapping table | Arbitrary placement; extra hop + consistency burden |
"Directory-based" is Gage's name for what Dicken (sources/2026-04-21-planetscale-database-sharding) and Guevara (sources/2026-04-21-planetscale-sharding-strategies-directory-based-range-based-and-hash-based) call lookup sharding. Guevara's post uses Gage's directory-based terminology verbatim; the two names are synonyms on this wiki.
Query-load distribution is the selection input¶
"If your sharding scheme isn't random (e.g. hash based), you can begin to see why query profiling and understanding how your load is distributed can be useful." (Source: sources/2026-04-21-planetscale-what-is-database-sharding-and-how-does-it-work)
Hash sharding's strength is that it doesn't require knowing the load distribution in advance — the hash smears both data and load across shards regardless of the shard-key value distribution. Range and directory strategies, conversely, are only as good as the operator's prior on how load is actually distributed. When your prior is wrong (or drifts), range/directory strategies concentrate load on whichever shard ended up owning the popular keys — exactly the Figma snowflake-ID hotspot and Dicken's age-range hotspot.
"As simple or as complicated as you make it"¶
Gage's closing framing on strategy choice:
"All of this is to say that sharding can be as simple or as complicated as you make it."
Notion's team-ID split is one end of the spectrum (one column, no lookup table, no hash); Uber Schemaless / TAO-style directory-per-object-type sit at the other (operator-authored placement at object-class granularity). The choice of strategy sets the ceiling on how much operator work the cluster demands over time.
Seen in¶
- sources/2026-04-21-planetscale-what-is-database-sharding-and-how-does-it-work — Justin Gage (guest post for PlanetScale, 2023-04-06) canonicalises the term "partition strategy" as the umbrella noun for the three algorithms (hash / range / directory-based) and names the selection input: "a direct function of how your business runs, and where your query load is concentrated." Notion's team-ID worked example is the canonical minimalist choice; the Amazon
order_idexample is the canonical high-cardinality-hash choice. - sources/2026-04-21-planetscale-database-sharding — Ben Dicken's four-strategy enumeration (range / hash / lookup / custom) is the mechanics-altitude complement to Gage's business-altitude framing; Dicken treats the same choice space, adds the custom-function escape hatch.
- sources/2026-04-21-planetscale-sharding-strategies-directory-based-range-based-and-hash-based — Holly Guevara uses "directory-based" terminology verbatim and walks the operator-tuned-resharding loop that keeps each strategy workable over time.