CONCEPT
Horizontal sharding¶
Horizontal sharding splits a single logical table (or group of related tables) so that its rows live across multiple physical database instances. Each row is assigned to a shard by a function of its shard key (commonly hash(shard_key) → shard_id). This is orthogonal to vertical partitioning, which splits different tables across databases while keeping each whole table intact on one instance.
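The hash(shard_key) → shard_id mapping can be sketched as follows. This is a minimal illustration, not Figma's implementation; real routers often use consistent hashing or explicit range maps so that shard splits move fewer rows.

```python
import hashlib

def shard_for(shard_key: str, num_shards: int) -> int:
    """Map a shard key to a shard id via a stable hash (hypothetical sketch)."""
    # Use a cryptographic hash so the mapping is stable across processes,
    # unlike Python's builtin hash(), which is salted per interpreter run.
    digest = hashlib.sha256(shard_key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

# Rows with the same shard key always route to the same shard,
# which is what makes shard-key-filtered queries single-shard.
assert shard_for("user-42", 16) == shard_for("user-42", 16)
```

Any deterministic, uniformly distributing function works; the modulo form shown here is the simplest but forces large row movements when num_shards changes.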
Why take this on¶
- Vertical partitioning ceiling: once an individual table exceeds the host DB's practical limits (TB-scale heap / billions of rows / vacuum stalls / IOPS ceilings), vertical partitioning can't help — the smallest unit of vertical partitioning is a single table (Source: sources/2026-04-21-figma-how-figmas-databases-team-lived-to-tell-the-scale).
- Transparent future scale-outs: once a table is horizontally sharded at the application layer, it supports any number of physical shards; future scale-outs are a shard-split operation with no application-level change.
What you give up vs a single-node relational DB¶
- SQL feature set shrinks. Foreign keys, globally unique indexes, and certain join / aggregation / nested-SQL patterns become inefficient or infeasible to enforce across shards. Unique indexes often only survive if they include the shard key.
- Schema changes must coordinate across all shards.
- Transactions can span multiple shards → partial commit failures become a product-visible failure mode (the canonical illustration: "moving a team between two orgs, only to find half their data was missing" — Source: sources/2026-04-21-figma-how-figmas-databases-team-lived-to-tell-the-scale).
- Atomic cross-shard transactions require either a distributed-transaction protocol or product-layer compensation logic; many implementations (like Figma) deliberately skip them and make the product resilient to cross-shard failures.
- Queries without a shard-key filter become scatter-gather — fan out to every shard, aggregate back. These are expensive to implement correctly and act as a scale cap: each scatter-gather query touches every shard, so its total load matches the unsharded case.
Two axes to design along¶
- Shard-key selection (concepts/shard-key). One key across all tables vs a small set of per-domain keys (Figma's choice: UserID, FileID, OrgID — each covers almost every table; tables sharing a key are grouped into colos). One-key-fits-all often forces a composite key + backfill + app refactor; per-domain keys require a colocation scheme but minimize churn.
- Logical vs physical split (concepts/logical-vs-physical-sharding). The "serves as if sharded" state can be decoupled from the "data physically across multiple hosts" state, so the riskier physical failover happens only after the application layer is trusted end-to-end.
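The per-domain-key scheme implies a registry mapping each table to its shard key; tables sharing a key form a colo that is moved and split as a unit. A minimal sketch, with hypothetical table names (the three keys are from the Figma source, the rest is illustrative):

```python
# Hypothetical colo registry: every table declares which key it shards on.
# Tables under the same key must be split/moved together so that
# same-key joins stay single-shard.
COLOS = {
    "UserID": {"users", "user_settings", "sessions"},
    "FileID": {"files", "file_comments"},
    "OrgID": {"orgs", "billing"},
}

def shard_key_for(table: str) -> str:
    """Return the shard key (and hence the colo) a table belongs to."""
    for key, tables in COLOS.items():
        if table in tables:
            return key
    raise KeyError(f"{table} has no registered shard key")

assert shard_key_for("file_comments") == "FileID"
```

The registry is what keeps routing decidable: given a query's table, the router knows which key to extract and which colo's shards to target.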
Build vs buy¶
Existing horizontally-sharded SQL/NoSQL options: CockroachDB, TiDB, Spanner, Vitess (and the NoSQL family). Adopting any of these typically means a full cross-store data migration + rebuilding operational expertise on the new substrate. Figma's 2022 decision to build in-house on top of vertically-partitioned RDS Postgres was driven by aggressive runway pressure + deep operational expertise + a relational data model NoSQL couldn't express (Source: sources/2026-04-21-figma-how-figmas-databases-team-lived-to-tell-the-scale). The explicit corollary: once initial runway is bought, the in-house choice is worth re-evaluating against mature NewSQL / managed options — the choice was shaped by the deadline, not by long-term preference.
Distinction from auto-sharding / shard management¶
Horizontal sharding at the SQL database level (this page) is about splitting a table's rows across DB instances. "Dynamic sharding" at the shard-manager level (e.g. systems/dicer) is about assigning stateful service pods to key ranges with a controller. These are complementary axes: horizontal sharding answers "where does the row go?"; dynamic sharding answers "which pod owns the range now?"
Seen in¶
- sources/2026-04-21-figma-how-figmas-databases-team-lived-to-tell-the-scale — Figma's in-house horizontal sharding on RDS Postgres: vertical partitioning → horizontal sharding path, colos, hash-of-shard-key routing, logical-vs-physical decoupling via views, DBProxy query engine, shadow-readiness query-subset selection. First sharded table September 2023 with 10s partial primary availability / no replica impact.
Related¶
- concepts/vertical-partitioning
- concepts/shard-key
- concepts/scatter-gather-query
- concepts/logical-vs-physical-sharding
- concepts/static-sharding / concepts/dynamic-sharding (shard-management axis)
- patterns/sharded-views-over-unsharded-db
- patterns/colocation-sharding
- patterns/shadow-application-readiness
- systems/dbproxy-figma