CONCEPT Cited by 2 sources
Cross-shard query¶
A cross-shard query is any query whose execution requires data from more than one physical shard in a horizontally-sharded database. The router must contact each involved shard and combine their results. Ben Dicken's framing (Source: sources/2026-04-21-planetscale-database-sharding):
"Cross-shard queries are bad for system performance, and should be avoided whenever possible. When multiple shards need to fulfill a single query, this adds excessive network and CPU overhead to the database cluster."
Cross-shard vs scatter-gather¶
Cross-shard is the general phenomenon; scatter-gather is a specific shape of it:
- Cross-shard query — involves ≥2 shards. Could be 2 (a pointed two-shard join), could be all of them.
- Scatter-gather query — specifically fans out to every shard, then aggregates the results back. A worst-case cross-shard query.
A query that routes to 2 of 128 shards is cross-shard but not scatter-gather; a query without a shard-key predicate becomes scatter-gather. The terminology on this wiki reflects the two sources of the framing: Dicken names the general case (cross-shard); Figma / the Vitess community name the fan-out shape (scatter-gather). Same underlying cost structure at the extreme.
Why it's a scale cap, not just latency¶
Any query that touches every shard loads the cluster exactly like the same query on an unsharded database would — N shards each do the same work as a single node would have done, the router aggregates, and the total capacity for that query class is the single-node capacity, not N × single-node.
So scale-out works for the dominant query path only if that path routes to a single shard. Every cross-shard query fraction steals capacity from the horizontal-sharding promise. If cross-shard queries are 10% of the workload, 10% of the cluster capacity behaves as if the database were unsharded — and that fraction is the scale cap for cross-shard-heavy workloads.
Four canonical sources of cross-shard queries¶
- Shard-key-less predicates.
SELECT ... FROM users WHERE email = ?on a table sharded byuser_id— no shard-key filter, every shard is a candidate. - Range scans on a hashed shard key.
WHERE user_id BETWEEN 1 AND 1000under hash sharding — sequential keys hash to different shards, so the range scatters. - Joins across shard-key boundaries. Joining
users(sharded byuser_id) toorgs(sharded byorg_id) without co-location means the proxy fans out the join. - Global aggregations.
SELECT COUNT(*) FROM orders— every shard must contribute.
Dicken's worked example for (1) uses the steps table:
"What would happen if we had done range sharding on the
step_countcolumn. Try inserting some rows and then click select where id == 1 to get all of the entries for user with id = 1. As you can see, the SELECT query may have to spread across multiple shards to make this work!" (Source: sources/2026-04-21-planetscale-database-sharding)
The dominant query is "all steps for user X"; sharding steps by step_count (a low-cardinality, high-volatility column) forces that query across every shard. Re-sharding by user_id (query-pattern-aligned choice) collapses it to a single-shard query.
Mitigation — pick the shard key for the dominant query¶
The discipline for avoiding cross-shard queries is shard-key-aligned-with-query-pattern: choose the shard key so the most common query's predicate contains the shard key, so that query routes to exactly one shard. Every other query becomes cross-shard; that's the trade-off.
For queries that must be cross-shard, the system accepts the cost: a point-lookup by email gets a lookup Vindex; global aggregations are pre-computed or answered by an OLAP replica; cross-colo joins are either colocated or manually assembled at the application layer.
Seen in¶
- sources/2026-04-21-planetscale-grouping-and-aggregations-on-vitess — Andres Taylor's 2022-06-24 post canonicalises nested-loop join as the canonical cross-shard-query join shape when the two sides' shard keys don't align, and the aggregation-pushdown-under-join rewrite as the planner's mitigation for aggregate-over-cross-shard-join queries. The rewrite's win depends on group-collapsing factor: 1M LHS rows collapsed to 1000 groups → 1000× fewer RHS queries. Canonical wiki datum that the cost structure of cross-shard queries is what elevates decades-old query-optimisation algorithms (Galindo-Legaria & Joshi 2001) from marginal to load-bearing — the per-query RTT in the cross-shard regime is ~1000× the single-node regime, so a 1000× reduction in query count moves from "nice to have" to "critical."
- sources/2026-04-21-planetscale-database-sharding — Dicken's canonical naming of the general phenomenon; the
steps-sharded-by-step_countworked example; "Cross-shard queries are bad for system performance, and should be avoided whenever possible." - sources/2026-04-21-figma-how-figmas-databases-team-lived-to-tell-the-scale — Figma's specific fan-out-shape framing (scatter-gather) of the same phenomenon, with the "your speedy database starts feeling more like a snail" analogy.
- sources/2026-04-21-planetscale-achieving-data-consistency-with-the-consistent-lookup-vindex — lookup Vindex as the Vitess mitigation for secondary-attribute queries that would otherwise be cross-shard.