
Geographic sharding

Geographic sharding partitions data by a location dimension — usually the user's destination country / region / continent — so that each shard contains only the subset of the system relevant to serving requests for that region. The shard key is the query dimension, not a tenant ID.
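A minimal sketch of this idea, with hypothetical names (`REGION_OF`, `shard_for` are illustrations, not from the source): the shard is derived from the request's destination, so routing a request is a pure function of the query dimension.

```python
# Hypothetical sketch: the shard key is derived from the request's
# destination region, not from who owns the data.
REGION_OF = {"DE": "eu", "FR": "eu", "US": "na", "JP": "apac"}

def shard_for(destination_country: str) -> str:
    """Map a request's destination country to its regional shard."""
    return REGION_OF[destination_country]
```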

Why use it

Locality-dominated workloads often have the property that the relevant subgraph for a given request is a small slice of the total data:

  • Print order routing: only suppliers that can ship to the customer's country matter.
  • Logistics / last-mile: only lanes, warehouses, and couriers in the destination region are candidates.
  • CDN / edge caches: only content popular in a region is worth holding locally.
  • Regulatory residency: some data legally can't leave the region.

Geographic sharding turns a "scan/traverse the global system" problem into a "scan/traverse one regional slice" problem.
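A toy illustration of that reduction, assuming a hypothetical supplier model: data is partitioned up front by the regions it serves, so answering a request scans one regional slice instead of the global list.

```python
# Hypothetical data model: partition suppliers by the regions they ship to,
# so a request only scans its region's slice, never the global supplier list.
from collections import defaultdict

suppliers = [
    {"id": "s1", "ships_to": {"eu", "na"}},
    {"id": "s2", "ships_to": {"apac"}},
    {"id": "s3", "ships_to": {"eu"}},
]

by_region = defaultdict(list)
for s in suppliers:
    for region in s["ships_to"]:
        by_region[region].append(s["id"])

def candidates(region: str) -> list[str]:
    # Cost is proportional to the regional slice, not the global set.
    return by_region[region]
```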

Trade-offs

  • Smaller working set per request. Faster retrieval (Canva reports ~6 ms per-region Redis retrieval, versus the much worse latency a single global graph would incur) and faster traversal (smaller |V| and |E|).
  • Uneven shard sizes. Some regions are an order of magnitude bigger than others. Canva reports up to ~20 ms retrieval on its largest shards vs. ~6 ms elsewhere — the hot-shard problem.
  • Cross-region queries become hard. If a request genuinely needs multiple regional slices, you're either scatter-gathering or falling back to a bigger / global view.
  • Rebalancing on boundary changes. Adding a new country or redrawing regions requires re-sharding the data.
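When a request does span regions, the usual fallback is scatter-gather: fan the query out to each relevant shard and merge. A sketch, with a stubbed per-shard lookup standing in for the real regional stores:

```python
# Sketch of scatter-gather across regional shards. query_shard is a stub;
# a real system would hit the per-region database or cache here.
from concurrent.futures import ThreadPoolExecutor

def query_shard(region: str) -> list[str]:
    fake_stores = {"eu": ["s1", "s3"], "na": ["s1"]}  # stand-in data
    return fake_stores.get(region, [])

def scatter_gather(regions: list[str]) -> list[str]:
    # Fan out to each shard in parallel, then merge and deduplicate.
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(query_shard, regions))
    merged = set()
    for shard_result in results:
        merged.update(shard_result)
    return sorted(merged)
```

The cost is the slowest shard plus the merge, which is why cross-region queries erode the latency win that motivated sharding in the first place.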

Precompute + shard + cache

Often combined with CQRS-style async projection: the source of truth stays unsharded (or sharded differently), and a derived, per-region read model is precomputed and cached. Canva's routing is exactly this shape — relational source of truth rebuilt into per-region graphs pushed to ElastiCache / Redis. See patterns/async-projected-read-model.
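The shape above can be sketched as follows, with all names hypothetical and a plain dict standing in for Redis / ElastiCache: the unsharded source of truth is periodically projected into one serialized read model per region.

```python
# Hypothetical sketch of precompute + shard + cache: rebuild a derived,
# per-region read model from an unsharded source of truth and push each
# slice to a cache (a dict stands in for Redis / ElastiCache).
import json

source_of_truth = [  # unsharded rows, e.g. from a relational database
    {"supplier": "s1", "region": "eu"},
    {"supplier": "s1", "region": "na"},
    {"supplier": "s2", "region": "apac"},
]

cache: dict[str, str] = {}  # stand-in for a per-region Redis keyspace

def rebuild_projection() -> None:
    per_region: dict[str, list[str]] = {}
    for row in source_of_truth:
        per_region.setdefault(row["region"], []).append(row["supplier"])
    for region, supplier_ids in per_region.items():
        # One serialized read model per region; readers fetch only their slice.
        cache[f"routing:{region}"] = json.dumps(sorted(supplier_ids))

rebuild_projection()
```

Because the projection is derived, re-sharding on boundary changes is a rebuild of the read model rather than a migration of the source of truth.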
