CONCEPT Cited by 1 source
Scatter-gather query¶
A scatter-gather query is a query executed in a sharded system that cannot be routed to a single shard (because its predicate doesn't include the shard key or because it needs data from multiple shards). The router fans it out to every shard (scatter phase), then aggregates the results back (gather phase).
Figma's DBProxy framing (Source: sources/2026-04-21-figma-how-figmas-databases-team-lived-to-tell-the-scale):
Think of scatter-gather like a database-wide game of hide-and-seek: You send out your query to every shard (scatter), then piece together answers from each (gather). Fun, but overdo it, and your speedy database starts feeling more like a snail, especially with complex queries.
When scatter-gather happens¶
- Shard-key-less predicates —
WHERE status = 'active'on a table sharded byuser_id. - Range scans on the shard key when the shard key is hashed —
hash(shard_key)routing scatters sequential keys across all shards. - Global aggregations —
COUNT(*)over a whole table,SUMacross orgs. - Joins across colos or joins not on the shard key.
- Nested SQL / correlated subqueries whose outer and inner queries don't share the same shard-key filter.
Why it's a scale cap, not just a latency cost¶
Every scatter-gather touches every shard → the load profile of one SG call is identical to running the same query on an unsharded database. More shards don't help. If scatter-gather is a large fraction of traffic, horizontal sharding's core promise — more shards = more capacity — stops holding. The sharded database asymptotes to single-node performance as the SG fraction grows.
This is the reason Figma's DBProxy deliberately restricts the sharded-query language rather than supporting full SQL compatibility. Three kinds of costs pushed that decision:
- Query-engine complexity. Aggregations, cross-colo joins, and nested SQL with SG execution plans all require custom implementation in the router — DBProxy would "begin to look a lot like the Postgres database query engine."
- Scale cap. Too many SGs invalidate the shard count as a scalability knob.
- Application refactor burden. Every unsupported query is product code to rewrite.
Choosing the supported-query subset¶
Figma's answer — patterns/shadow-application-readiness — logged query plans against live production traffic to empirically identify the subset covering 90% of queries while excluding the worst-case engine-complexity patterns. Resulting restriction:
- All range scans and point queries allowed.
- Joins only when joining two tables in the same colo on the shard key.
Queries outside the subset must be rewritten — usually by adding a shard-key predicate, denormalizing across colos, or moving the aggregation to an offline / asynchronous path.
Relation to the Redis-proxy flavor¶
FigCache (systems/figcache) implements a structurally similar "fanout filter engine" that transparently scatters read-only multi-shard pipelines across Redis Cluster shards to sidestep CROSSSLOT errors. Same primitive — fan-out + aggregate — at the wire-protocol-proxy level instead of the SQL query-engine level.
Mitigations¶
- Push down predicates aggressively — any predicate the planner can resolve per-shard reduces network return size.
- Parallel scatter — fan out concurrently to overlap per-shard latencies; end-to-end latency ≈ slowest shard.
- Hedged requests against slow shards (implemented in DBProxy — Source: sources/2026-04-21-figma-how-figmas-databases-team-lived-to-tell-the-scale).
- Load-shedding on SG-heavy paths to protect non-SG traffic.
- Pre-aggregation into a summary table (often sharded the same way as the facts) to convert a common SG into a single-shard lookup.
Seen in¶
- sources/2026-04-21-figma-how-figmas-databases-team-lived-to-tell-the-scale — DBProxy's logical planner distinguishes single-shard from scatter-gather plans; the sharded-query subset is explicitly chosen to avoid worst-case SG complexity; shadow-application-readiness picks the subset from live traffic.
- sources/2026-04-21-figma-figcache-next-generation-data-caching-platform — FigCache's fanout filter engine scatters read-only multi-shard Redis pipelines in parallel.