

Availability Dependency

An availability dependency exists when service A cannot serve its SLO without service B being available. A's availability is upper-bounded by B's: if B is down, A is down.

Availability dependency is the inverse of blast radius: blast radius asks "who do I take down when I fail?"; availability dependency asks "who takes me down when they fail?"

Why it matters

Every synchronous network call to another service adds an availability dependency. A service with N such dependencies, each at 99.9 % uptime, has an upper-bound uptime of (0.999)^N — which drops fast:

N deps   Upper-bound uptime
1        99.9 %
5        99.5 %
10       99.0 %
50       95.1 %

In practice, dependency failures cluster (shared data centers, shared network, shared deploys) so the numbers are approximations — but the directional point holds: more dependencies = lower ceiling.
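
The compounding is easy to check directly. A minimal Python sketch (uptime_ceiling is an illustrative name, not from the source), under the same independence assumption the caveat above warns about:

    def uptime_ceiling(p: float, n: int) -> float:
        # Upper bound on uptime with n hard synchronous dependencies,
        # each independently available a fraction p of the time.
        return p ** n

    for n in (1, 5, 10, 50):
        print(f"{n:2d} deps -> {uptime_ceiling(0.999, n):.1%}")
    # prints:
    #  1 deps -> 99.9%
    #  5 deps -> 99.5%
    # 10 deps -> 99.0%
    # 50 deps -> 95.1%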

Architectural responses

Teams that take this seriously use four moves to break hard dependencies into soft ones:

  1. Cache — hold a recent answer; serve during downstream outage. See patterns/cached-lookup-with-short-ttl.
  2. CDN front — let an edge cache serve previously-computed answers during origin outage. See patterns/cdn-in-front-for-availability-fallback.
  3. Retry across replicas — don't depend on one instance; try another on error (see the sketch after this list).
  4. Fail open where safe — degraded-but-up beats down for non-critical paths.
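
A sketch of move 3 (retry across replicas), assuming a list of interchangeable replicas that each expose a query() method; the names here are illustrative, not from the source:

    import random

    class AllReplicasFailed(Exception):
        pass

    def query_any(replicas, key, attempts=3):
        # Don't pin availability to one instance: try a few replicas,
        # shuffled so one bad replica doesn't absorb every first attempt.
        candidates = random.sample(replicas, k=min(attempts, len(replicas)))
        last_error = None
        for replica in candidates:
            try:
                return replica.query(key)
            except ConnectionError as exc:  # retry only transport-level failures
                last_error = exc
        raise AllReplicasFailed(key) from last_error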

The aggregate pattern is called cache for availability: the cache isn't primarily about latency or bandwidth; it's about decoupling your uptime from your dependency's uptime.
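
A minimal sketch of that decoupling, assuming a fetch() callable that hits the dependency (AvailabilityCache is an illustrative name; the 30-second default TTL echoes the shared-memory cache in the GitHub Pages case below):

    import time

    class AvailabilityCache:
        def __init__(self, fetch, ttl_seconds=30.0):
            self.fetch = fetch      # callable that hits the dependency
            self.ttl = ttl_seconds
            self.entries = {}       # key -> (value, fetched_at)

        def get(self, key):
            entry = self.entries.get(key)
            if entry and time.monotonic() - entry[1] < self.ttl:
                return entry[0]     # fresh hit: no dependency call at all
            try:
                value = self.fetch(key)
            except Exception:
                if entry:
                    return entry[0]  # dependency down: serve the stale answer
                raise                # nothing cached: the failure propagates
            self.entries[key] = (value, time.monotonic())
            return value

Note the asymmetry with a latency cache: here a stale answer is the point, because serving it is what keeps you up during the outage.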

Seen in

  • sources/2025-09-02-github-rearchitecting-github-pages: GitHub Pages's 2015 rewrite moves the hostname-to-backend routing decision from a static nginx map to a per-request MySQL read-replica query. GitHub calls this out explicitly as a trade: "this introduces an availability dependency on MySQL. This means that if our MySQL cluster is down, so is GitHub Pages." The mitigations are four separate moves — retries to different replicas, a 30 s shared-memory cache, targeting replicas (not the master), and Fastly fronting all 200 responses — each addressing a different failure mode of the new dependency.