SYSTEM Cited by 4 sources
HashiCorp Consul
Consul is HashiCorp's service-discovery system and key-value (KV) store. It uses Raft for consensus over the KV store and a SWIM-based gossip layer (Serf) for membership.
The wiki tracks Consul where it appears as a load-bearing substrate — or, more often at Fly.io, a rejected one.
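Raft commits a write only after a majority (quorum) of server nodes has acknowledged it, which is why consensus latency tracks the round trip to the slower half of the cluster. The arithmetic is sketched below; the function names are hypothetical helpers for illustration, not part of any Consul API.

```python
def raft_quorum(servers: int) -> int:
    """Smallest majority of a Raft cluster with `servers` voting nodes."""
    if servers < 1:
        raise ValueError("a Raft cluster needs at least one server")
    return servers // 2 + 1


def failure_tolerance(servers: int) -> int:
    """How many servers can fail while a quorum is still reachable."""
    return servers - raft_quorum(servers)


if __name__ == "__main__":
    # Print quorum size and failure tolerance for common cluster sizes.
    for n in (1, 3, 4, 5, 7):
        print(n, raft_quorum(n), failure_tolerance(n))
```

Note that an even-sized cluster tolerates no more failures than the odd size below it: four servers survive one failure, the same as three.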
Seen in
- sources/2025-10-22-flyio-corrosion: Consul as the thing Fly.io originally built global routing on, and the thing Corrosion replaced. "For longer than we should have, we relied on HashiCorp Consul to route traffic. Consul is fantastic software. Don't build a global routing system on it. Then we built SQLite caches of Consul. SQLite: also fantastic. But don't do this either." Fly's Consul cluster "running on the biggest iron we could buy, wasted time guaranteeing consensus for updates that couldn't conflict in the first place", the canonical anti-pattern for WAN Raft. Consul also appears as the trigger of an uplink-saturation outage: a Consul mTLS certificate expiry severed every worker's Consul connection; the backoff-loop retries on each worker re-invoked Machine-state code paths that wrote to Corrosion, saturating Fly's uplinks fleet-wide. Fly now runs Corrosion without Consul.
- sources/2024-12-12-stripe-the-secret-life-of-dns-packets-investigating-complex-networks: Consul as the service-discovery substrate behind Stripe's DNS interface. Stripe's central Unbound DNS-resolver cluster has a forwarding rule that routes service-discovery domains to a Consul cluster; application code resolves service names via plain DNS. This sits at a different operational altitude from the Fly.io and Roblox instances: here Consul is the service-discovery backing store for a DNS-fronted naming abstraction, not the routing brain of a global fleet.
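To make the DNS-fronted pattern in the Stripe entry concrete, here is a minimal sketch of what "resolving service names via plain DNS" could look like from application code. It assumes the classic Consul DNS name layout (`[tag.]<service>.service[.datacenter].<domain>`) and the default `consul` domain; the source does not say which domain or name form Stripe uses. The helper names and the service name `web` are hypothetical.

```python
import socket
from typing import Callable, List, Optional


def consul_dns_name(service: str,
                    tag: Optional[str] = None,
                    datacenter: Optional[str] = None,
                    domain: str = "consul") -> str:
    """Build a service-lookup name in the classic Consul DNS layout.

    Assumption: the resolver forwards `domain` to a Consul cluster.
    """
    labels = [tag, service, "service", datacenter, domain]
    return ".".join(label for label in labels if label)


def resolve_service(name: str, port: int,
                    resolver: Callable = socket.getaddrinfo) -> List[str]:
    """Resolve a service name to a sorted, de-duplicated list of addresses.

    The application only speaks ordinary DNS; it needs no Consul client.
    `resolver` is injectable so the lookup can be stubbed in tests.
    """
    infos = resolver(name, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    name = consul_dns_name("web")  # hypothetical service name
    print(name)
    # Succeeds only where a resolver forwards this domain to Consul.
    try:
        print(resolve_service(name, 80))
    except socket.gaierror as exc:
        print("lookup failed:", exc)
```

The appeal of this arrangement is that the application depends only on the system resolver, so the backing store can change without touching client code.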
Related
- systems/corrosion-swim — Fly.io's Consul replacement.
- systems/swim-protocol — the specific gossip variant Consul uses (via Serf).
- systems/roblox-hashistack — Roblox's on-prem HashiStack deployment and the 73-hour Oct-2021 Consul-streaming-regression outage.
- systems/unbound — Stripe's DNS resolver front-end forwarding to Consul for service-discovery domains.
- concepts/no-distributed-consensus
- concepts/gossip-protocol
- concepts/consul-streaming-vs-longpoll
- companies/highscalability
- companies/stripe
Also seen in
- sources/2023-07-16-highscalability-gossip-protocol-explained — Consul named as a canonical SWIM (via Serf) deployment for "group membership, leader election, and failure detection of consul agents." Third-party explainer-level citation rather than a Consul-operator post; useful as the textbook-level pointer to Consul's gossip layer for readers coming to Consul from the gossip-protocol concept page.
- sources/2022-12-02-highscalability-stuff-the-internet-says-on-scalability-for-december-2nd-2022 — Summary of Roblox's Oct-2021 73-hour outage post-mortem. Roblox ran all backend services on one Consul cluster (18,000 servers, 170,000 containers). Enabling Consul's new streaming feature on the traffic-routing tier under high concurrent read+write load caused contention on a single Go channel, blocking KV writes for tens of seconds. Rolling streaming back fleetwide dropped Consul KV write P50 back to 300 ms. See concepts/consul-streaming-vs-longpoll for the detailed failure-mode analysis and systems/roblox-hashistack for the operational context.