PATTERN
Geospatial gossip sharding¶
Problem¶
A stateful dispatch service — Uber-style rider-driver matching, food-delivery dispatch, on-demand logistics — needs to:
- Shard state by geography so a rider's request can be matched against only nearby drivers, not globally.
- Keep each node's view of cluster membership + ownership consistent so requests route to the right shard.
- Survive stateful-node failures without losing the in-memory supply/location state.
A central coordinator (ZooKeeper/etcd) is an option, but becomes a single point of coupling, and round-trip membership lookups add latency to every dispatch decision.
Solution — the Uber 2014 stack shape¶
Compose three primitives:
- Geospatial index with cell IDs as shard key. Use S2 or H3 to partition the Earth's surface into labelable cells. Each dispatch node owns one or more cells. Rider requests route by cell ID.
- Gossip protocol for cluster membership + routing. Every node gossips periodically to a random subset of peers about who's alive and which cells they own. Information converges across the cluster in O(log₍fanout₎ N) rounds without a central coordinator.
- Stateful service instances — the dispatch nodes themselves hold driver-supply + ride-request state in memory, scoped to the cells they own. Gossip handles re-sharding when nodes join or leave.
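The first two primitives compose roughly like this. A minimal Python sketch, using a toy fixed-grid quantizer as a stand-in for S2/H3 and a basic consistent hash ring to map cell IDs to dispatch nodes (`cell_id`, `HashRing`, and the node names are all illustrative, not Ringpop's actual API):

```python
import bisect
import hashlib

def cell_id(lat: float, lng: float, res: float = 0.01) -> str:
    """Toy grid quantizer standing in for S2/H3: snap coordinates
    to a fixed-resolution cell and return a labelable cell ID."""
    return f"{int(lat / res)}:{int(lng / res)}"

class HashRing:
    """Basic consistent hash ring mapping cell IDs to dispatch nodes.
    Virtual nodes (vnodes) smooth out the distribution of cells per node."""

    def __init__(self, nodes, vnodes: int = 64):
        self._ring = sorted(
            (self._hash(f"{n}#{i}"), n) for n in nodes for i in range(vnodes)
        )
        self._keys = [h for h, _ in self._ring]

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.sha1(key.encode()).hexdigest(), 16)

    def owner(self, cell: str) -> str:
        """Route a cell ID to the node owning that position on the ring."""
        i = bisect.bisect(self._keys, self._hash(cell)) % len(self._keys)
        return self._ring[i][1]

ring = HashRing(["dispatch-a", "dispatch-b", "dispatch-c"])
rider_cell = cell_id(37.7749, -122.4194)  # somewhere in San Francisco
print(rider_cell, "->", ring.owner(rider_cell))
```

Two riders in the same cell hash to the same node, so their requests land on the node holding that cell's driver-supply state; gossip's job (next primitive) is to keep every node's copy of the ring membership converged.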
Canonical instance — Uber 2014 dispatch rewrite¶
"We used Google's S2 library to segment cities into areas of consideration and used the S2 cell ID as the sharding key. (…) Since these services were still running on Node.js and were stateful, we needed a way to scale as the business grew. So we developed Ringpop, a gossip-protocol based approach to share geospatial and supply positioning for efficient matching."
— sources/2024-03-14-highscalability-brief-history-of-scaling-uber
The three-primitive stack for Uber:
- Cells: S2 (later H3, which Uber open-sourced as the hexagonal successor).
- Gossip + sharding library: Ringpop (Node.js).
- Dispatch nodes: stateful Node.js services running over Ringpop.
This stack drove Uber's city-scale rider-driver matching through the mid-2010s and is still a load-bearing part of the fulfillment substrate today (the 2021+ Fulfillment Platform rebuilt many pieces on Spanner, but the geospatial-shard shape remains relevant).
Why this works¶
- Locality of matching. A rider only needs to be matched against drivers nearby — most TSP-variant matching logic runs on a single node's local state.
- No central coordinator. Gossip means failure of any one node does not block progress elsewhere.
- Elastic. Adding/removing dispatch nodes re-balances cells organically via gossip.
- Low operational complexity per added node. Each node runs the same code; no leader election or quorum negotiation required.
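The locality point can be made concrete with a sketch of matching on a single node's in-memory state, assuming a toy `supply` map keyed by the cells that node owns (structure and names are illustrative, not Uber's schema):

```python
import math

# In-memory supply state held by one dispatch node, keyed by owned cells
# (illustrative structure; a fuller version would also scan neighbor cells).
supply = {
    "cell-17": [("driver-1", 37.775, -122.418), ("driver-2", 37.776, -122.421)],
    "cell-18": [("driver-3", 37.781, -122.410)],
}

def match(rider_cell: str, rider_lat: float, rider_lng: float):
    """Match purely against local state: only drivers in the rider's cell
    are candidates, so no cross-node call is needed."""
    candidates = supply.get(rider_cell, [])
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda d: math.hypot(d[1] - rider_lat, d[2] - rider_lng),
    )[0]

print(match("cell-17", 37.7749, -122.4194))
```

The whole dispatch decision is a lookup plus a small local minimization, which is what keeps per-request latency in the tens-of-milliseconds range.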
When to use¶
- Stateful dispatch / matching over geographic data at city or regional scale.
- Need for low-latency matching on the order of tens of milliseconds.
- Willingness to accept eventual consistency in cluster-membership views (gossip converges in O(log N) rounds, not instantaneously).
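That convergence rate is easy to check empirically with a small push-gossip simulation (a sketch of the general mechanism, not Ringpop's actual protocol; `gossip_rounds` and its parameters are hypothetical). Each round, every informed node pushes a fact to `fanout` random peers; rounds-to-full-coverage grows roughly logarithmically in N:

```python
import math
import random

def gossip_rounds(n: int, fanout: int = 3, seed: int = 42) -> int:
    """Simulate push gossip: node 0 learns a new fact (e.g. a cell-ownership
    change) and, each round, every informed node pushes it to `fanout`
    random peers. Return the number of rounds until all n nodes have it."""
    rng = random.Random(seed)
    informed = {0}
    rounds = 0
    while len(informed) < n:
        for _ in list(informed):
            informed.update(rng.sample(range(n), fanout))
        rounds += 1
    return rounds

# Rounds grow roughly like log base (fanout + 1) of n.
for n in (10, 100, 1000, 10000):
    print(f"n={n:>5}  rounds={gossip_rounds(n)}  log_4(n)~{math.log(n, 4):.1f}")
```

Scaling the cluster 1000x adds only a handful of rounds, which is why the pattern tolerates membership churn without a coordinator, and also why membership views are only eventually consistent.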
When not to use¶
- Global (non-geographic) matching — a normal consistent hash ring is simpler.
- Strongly-consistent cross-cell invariants — Uber's later move to Spanner for the Fulfillment Platform is the explicit admission that some workloads need a different substrate.
Related¶
- systems/ringpop — Uber's gossip-sharding library.
- systems/s2-geometry / systems/h3-geo — the spatial indexing primitives.
- systems/uber-dispatch — the canonical deployment.
- concepts/geospatial-sharding — the cross-cutting concept.
- concepts/gossip-protocol — the membership + routing substrate.
- concepts/traveling-salesman-problem — the problem shape this pattern makes tractable at locality scale.
- companies/uber — origin org.