CONCEPT
Logical vs physical sharding¶
Logical sharding and physical sharding are two different states in the horizontal-sharding rollout. Decoupling them — doing logical before physical, and being able to roll back each independently — is the central de-risking move in Figma's in-house horizontal-sharding effort (Source: sources/2026-04-21-figma-how-figmas-databases-team-lived-to-tell-the-scale).
Definitions¶
- Logical sharding: the application layer issues reads and writes as if the table is horizontally sharded (sharded connection poolers, shard-key routing, sharded-query-subset semantics, per-shard query execution) — but the data is still physically on a single database host.
- Physical sharding: the physical 1→N failover that moves data from one host to N hosts. Real shard infrastructure, real network partitions to worry about, real replication lag to manage.
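The routing half of the logical-sharding definition can be sketched as follows. This is a minimal illustration, not Figma's DBProxy code: the shard names, the 16-bit hash space, and the range table are all assumptions made for the example.

```python
import hashlib

# Hypothetical logical-shard map: each shard owns a half-open range
# of the hash space [0, 2**16). Names and ranges are illustrative.
SHARDS = [
    ("table_shard1", 0, 16384),
    ("table_shard2", 16384, 32768),
    ("table_shard3", 32768, 49152),
    ("table_shard4", 49152, 65536),
]

def shard_hash(shard_key: int) -> int:
    """Deterministically hash the shard key into [0, 2**16)."""
    digest = hashlib.md5(str(shard_key).encode()).digest()
    return int.from_bytes(digest[:2], "big")

def route(shard_key: int) -> str:
    """Return the logical shard (view name) that owns this key.

    Under logical sharding, every shard still lives on the same
    physical host; only the query routing behaves as if sharded.
    """
    h = shard_hash(shard_key)
    for name, lo, hi in SHARDS:
        if lo <= h < hi:
            return name
    raise AssertionError("hash ranges must cover the full hash space")
```

The same routing function serves both states: under logical sharding all four names resolve to one host, and physical sharding later changes only where each name points.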
Why decouple¶
| Risk axis | Logical rollout | Physical rollout |
|---|---|---|
| Reliability / latency / consistency visible to users | Same behavior as post-shard (so you can observe real impact) | 1→N failover; new failure modes like partial-shard-op success |
| Rollback | Configuration change back to the main table, seconds | Possible, but requires complex cross-shard coordination |
| Percentage rollout | Feature-flag-gated in DBProxy, arbitrary percentage | Binary (a given table is either pre- or post-shard) |
| Data movement | None | Full replication copy to N hosts |
| Developer time to fix a bug | Ship a query-engine fix, re-enable | Potentially involves coordinating replication state |
Rolling out logical first means the riskiest behavioral changes (sharded query execution, scatter-gather plans, topology lookups, DBProxy query-engine bugs) get caught and fixed on an unsharded physical layout — so when the physical failover happens, it's running code that's already been burned in under real traffic.
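The percentage rollout in the table above can be sketched as a deterministic hash-based gate. The bucketing scheme below is an assumption for illustration, not DBProxy's actual flag mechanism.

```python
import zlib

def in_rollout(table: str, shard_key: int, rollout_pct: int) -> bool:
    """Deterministically gate a (table, key) pair into the sharded
    query path at a given rollout percentage.

    Hashing makes the decision sticky: the same key always gets the
    same answer at a given percentage, and raising the percentage
    only adds keys, never flips earlier ones back.
    """
    bucket = zlib.crc32(f"{table}:{shard_key}".encode()) % 100
    return bucket < rollout_pct

# Rolling back logical sharding is just setting the percentage to 0.
```

This is what makes the logical rollback cheap: the gate is configuration, and the unsharded table is still the source of truth underneath.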
Implementation — Postgres views as the logical-shard representation¶
Figma explored three representations:
- Separate Postgres databases per logical shard — required the physical data move up front ⇒ no decoupling, so rejected.
- Separate Postgres schemas per logical shard — same up-front data move as separate databases ⇒ rejected.
- Postgres views per logical shard (chosen):
```sql
CREATE VIEW table_shard1 AS
SELECT * FROM table
WHERE hash(shard_key) >= min_range
  AND hash(shard_key) < max_range;
```
One such view per shard. All reads and writes go through the views via per-shard connection poolers, which still point at the same unsharded physical instance. Behavior is sharded end-to-end; the data hasn't moved.
This is the canonical instance of patterns/sharded-views-over-unsharded-db. The main performance risk of views (query-plan changes under the Postgres planner) was validated against a sanitized production query corpus plus a shadow-reads framework comparing view-based against direct-table execution, which confirmed <10% worst-case overhead.
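A runnable toy of the chosen representation, using SQLite in place of Postgres and a modulus in place of hash ranges (both simplifications made for the example): per-shard views over one unsharded table partition the rows without moving any data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # the single "unsharded host"
conn.execute("CREATE TABLE files (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO files VALUES (?, ?)",
    [(i, f"file-{i}") for i in range(100)],
)

# One view per logical shard. A real setup would use hash ranges on
# the shard key; id % 4 is a stand-in partitioning function.
NUM_SHARDS = 4
for s in range(NUM_SHARDS):
    conn.execute(
        f"CREATE VIEW files_shard{s} AS "
        f"SELECT * FROM files WHERE id % {NUM_SHARDS} = {s}"
    )

# The views are disjoint and together cover the whole table, so
# shard-routed reads see exactly the unsharded data.
per_shard = [
    conn.execute(f"SELECT COUNT(*) FROM files_shard{s}").fetchone()[0]
    for s in range(NUM_SHARDS)
]
total = conn.execute("SELECT COUNT(*) FROM files").fetchone()[0]
```

Because the views are just predicates over the one table, dropping them (the logical rollback) restores the pre-shard state with no data movement at all.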
Physical sharding comes after¶
Once the logical-sharded topology has been trusted in production (feature-flag rollout to 100%, enough time for cross-shard edge cases to surface), the physical failover is the only remaining risk. Much of the 1→N failover reuses the 1→1 failover machinery built for concepts/vertical-partitioning — so Figma's years of vertical-partitioning experience accelerated this step. First physical failover: 10 seconds of partial primary availability, no replica impact (Source: sources/2026-04-21-figma-how-figmas-databases-team-lived-to-tell-the-scale).
Sibling pattern: patterns/shadow-migration¶
Both ideas run new semantics against old infrastructure to de-risk. Difference in orientation:
- Shadow migration: old and new engines both execute; new engine's output is compared to old's.
- Logical-vs-physical sharding: only the new engine executes, but on old infrastructure — the infrastructure is the shadow, not a parallel engine.
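The contrast between the two patterns can be made concrete in a few lines. The function names and engine callables here are illustrative, not any real system's API.

```python
def shadow_compare(query, old_engine, new_engine, log_mismatch):
    """Shadow migration: both engines execute; the old engine's
    result is served, and the new engine's output is diffed
    against it."""
    old_result = old_engine(query)
    new_result = new_engine(query)
    if new_result != old_result:
        log_mismatch(query, old_result, new_result)
    return old_result  # users always see the old engine's answer

def logical_shard_execute(query, sharded_engine):
    """Logical-vs-physical sharding: only the new (sharded) engine
    executes, but against the old unsharded physical layout; the
    infrastructure, not a parallel engine, is the shadow."""
    return sharded_engine(query)
```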
Seen in¶
- sources/2026-04-21-figma-how-figmas-databases-team-lived-to-tell-the-scale — Figma's canonical instance: per-shard Postgres views against an unsharded physical database + DBProxy with a feature-flagged percentage rollout of sharded semantics; shadow reads quantify view overhead.