CONCEPT Cited by 1 source
Geographic sharding¶
Geographic sharding partitions data by a location dimension — usually the user's destination country / region / continent — so that each shard contains only the subset of the system relevant to serving requests for that region. The shard key is the query dimension, not a tenant ID.
Why use it¶
Locality-dominated workloads often have the property that the relevant subgraph for a given request is a small slice of the total data:
- Print order routing: only suppliers that can ship to the customer's country matter.
- Logistics / last-mile: only lanes, warehouses, and couriers in the destination region are candidates.
- CDN / edge caches: only content popular in a region is worth holding locally.
- Regulatory residency: some data legally can't leave the region.
Geographic sharding turns a "scan/traverse the global system" problem into a "scan/traverse one regional slice" problem.
Trade-offs¶
- Smaller working set per request. Faster retrieval (Canva: ~6 ms per-region Redis retrieval vs. one-big-graph latency that would be much worse), faster traversal (smaller V, E).
- Uneven shard sizes. Some regions are order-of-magnitude bigger than others. Canva reports up to ~20 ms retrieval on its largest shards vs. ~6 ms elsewhere — the hot-shard problem.
- Cross-region queries become hard. If a request genuinely needs multiple regional slices, you're either scatter-gathering or falling back to a bigger / global view.
- Rebalancing on boundary changes. Adding a new country or redrawing regions requires re-sharding the data.
Precompute + shard + cache¶
Often combined with CQRS-style async projection: the source of truth stays unsharded (or sharded differently), and a derived, per-region read model is precomputed and cached. Canva's routing is exactly this shape — relational source of truth rebuilt into per-region graphs pushed to ElastiCache / Redis. See patterns/async-projected-read-model.
Seen in¶
- sources/2024-12-10-canva-routing-print-orders — shard key = order destination region; per-region graph in Redis; tuned for fast retrieval + traversal on the query's actual scope.
Related¶
- concepts/central-first-sharded-architecture — adjacent idea (global coordinator + per-region shards for residency + scale)
- patterns/async-projected-read-model — how Canva builds the per-region shards asynchronously from a single source of truth
- concepts/min-cost-flow — sharding is the enabler that keeps V·E traversal tractable