CONCEPT Cited by 1 source
Availability multiplication of dependencies¶
Definition¶
A service synchronously depending on N downstream components has a
request availability upper-bounded by the product of each
component's individual availability. Two 99.9% dependencies on the
critical path cap the caller at 99.9% × 99.9% ≈ 99.8%. Every
downstream hop that must succeed for the caller to return 2xx adds
one more multiplication; at 99.9% per hop, each one costs roughly
another 0.1 percentage points.
The arithmetic is obvious; the architectural consequence is not: teams often compose services freely on the synchronous path, unaware that each additional "must-succeed" downstream call erodes their SLO.
Arithmetic¶
For independent failures (approximation — in practice failures cluster on shared infra, but the directional point stands):
| N deps @ 99.9% | Availability ceiling |
|---|---|
| 1 | 99.9% |
| 2 | 99.8% |
| 3 | 99.7% |
| 5 | 99.5% |
| 10 | 99.0% |
For mixed levels, multiply: A = A1 × A2 × ... × An.
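A minimal sketch of the product rule, assuming independent failures as stated above. The function names `availability_ceiling` and `downtime_minutes_per_30_days` are illustrative, not from the source; the second converts a ceiling into a monthly downtime budget to make the erosion easier to feel.

```python
from math import prod


def availability_ceiling(availabilities):
    """Upper bound on caller availability: the product of the availabilities
    (fractions in [0, 1]) of every dependency on the synchronous path.
    Assumes failures are independent."""
    return prod(availabilities)


def downtime_minutes_per_30_days(availability):
    """Minutes of allowed unavailability in a 30-day window."""
    return (1.0 - availability) * 30 * 24 * 60


if __name__ == "__main__":
    # Reproduce the table: N identical dependencies at 99.9% each.
    for n in (1, 2, 3, 5, 10):
        ceiling = availability_ceiling([0.999] * n)
        budget = downtime_minutes_per_30_days(ceiling)
        print(f"{n:>2} deps: ceiling {ceiling:.3%}, ~{budget:.0f} min/30 days")

    # Mixed levels: A = A1 × A2 × ... × An (values here are made up).
    print(f"mixed: {availability_ceiling([0.999, 0.9995, 0.99]):.3%}")
```

The table above rounds to one decimal place; the exact products are slightly higher (for example, two dependencies give 99.8001%).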
The Zalando Payments framing¶
Zalando Payments's Order Store service originally performed two synchronous operations per REST call: write to DynamoDB and publish a change event to Nakadi (Zalando's Kafka-backed event bus). With each downstream assumed to be 99.9% available, the service's ceiling was about 99.8%.
"As the availability of a service is the product of the availabilities of its dependencies, the more dependencies a service has, the lesser is its own availability. Let's assume DynamoDB and the message bus have availabilities of 99.9% each. Thus, the maximum availability for the service is 99.9% × 99.9% = 99.8%." (Source: sources/2022-02-02-zalando-utilizing-amazon-dynamodb-and-aws-lambda-for-asynchronous-event-publication)
The fix: push Nakadi publication off the synchronous path using the transactional outbox pattern — the service's critical path depends only on DynamoDB, so its ceiling returns to 99.9%. Nakadi unavailability can still delay event delivery, but it no longer fails the client's write.
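The effect of the fix can be illustrated with a small simulation. This is a sketch under stated assumptions, not a model of Zalando's system: failures are independent per request, both availabilities are the 99.9% assumed in the quote, and `simulate` with its parameters is a hypothetical name.

```python
import random


def simulate(trials=200_000, a_db=0.999, a_bus=0.999, seed=7):
    """Compare a synchronous dual write with an outbox-style write.

    Returns (sync_success_rate, outbox_success_rate, delayed_event_rate).
    Assumption: each request sees the database and the bus up or down
    independently, with the given availabilities.
    """
    rng = random.Random(seed)
    sync_ok = outbox_ok = delayed = 0
    for _ in range(trials):
        db_up = rng.random() < a_db
        bus_up = rng.random() < a_bus
        # Synchronous design: both calls must succeed to return 2xx.
        if db_up and bus_up:
            sync_ok += 1
        # Outbox design: only the database write is on the critical path.
        if db_up:
            outbox_ok += 1
            if not bus_up:
                # The write succeeded; the event is delivered later.
                delayed += 1
    return sync_ok / trials, outbox_ok / trials, delayed / trials


if __name__ == "__main__":
    sync_rate, outbox_rate, delayed_rate = simulate()
    print(f"sync dual write : {sync_rate:.4%}")
    print(f"outbox write    : {outbox_rate:.4%}")
    print(f"delayed events  : {delayed_rate:.4%}")
```

Every request that the synchronous design fails only because the bus is down becomes a successful write with a delayed event in the outbox design.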
The four architectural responses¶
Once the product-ceiling problem is recognised, four moves shrink the dependency count on the critical path:
- Decouple via durable log / outbox. Write once to a local durable store; a separate relay reads and fans out to secondary sinks asynchronously. See patterns/transactional-outbox, patterns/dynamodb-streams-plus-lambda-outbox-relay.
- Cache the answer. If the downstream call is a read, keep a recent answer and serve it during an outage. See concepts/cache-for-availability.
- Fail open where safe. Degraded-but-up beats down for non-critical paths (typically analytics, suggestions, personalisation).
- Shed the dependency entirely. Sometimes the right move is to ask "does this call belong on the hot path?" — the answer is often no.
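The first response in the list above can be sketched in memory. Everything here is a hypothetical stand-in: `OrderStore`, `FlakyBus`, `BusUnavailable`, and `relay_once` are illustrative names, and a Python list plays the role of the durable outbox. In a real system the order and its pending event must be committed atomically by the data store, and the relay runs as a separate process.

```python
from dataclasses import dataclass, field


class BusUnavailable(Exception):
    """Raised by the stand-in event bus while it is down."""


@dataclass
class FlakyBus:
    up: bool = True
    delivered: list = field(default_factory=list)

    def publish(self, event):
        if not self.up:
            raise BusUnavailable("event bus is down")
        self.delivered.append(event)


@dataclass
class OrderStore:
    orders: dict = field(default_factory=dict)
    outbox: list = field(default_factory=list)

    def write_order(self, order_id, payload):
        # Critical path: one local write that records both the order and
        # the pending event. The bus is never called here.
        self.orders[order_id] = payload
        self.outbox.append({"order_id": order_id, "payload": payload})
        return 201


def relay_once(store, bus):
    """Publish pending events in order; stop at the first failure and
    leave the remainder in the outbox for the next attempt."""
    sent = 0
    while store.outbox:
        try:
            bus.publish(store.outbox[0])
        except BusUnavailable:
            break
        store.outbox.pop(0)
        sent += 1
    return sent


if __name__ == "__main__":
    store, bus = OrderStore(), FlakyBus(up=False)
    print(store.write_order("o-1", {"amount": 10}))  # write succeeds
    print(relay_once(store, bus))                    # nothing delivered yet
    bus.up = True
    print(relay_once(store, bus))                    # backlog drains
```

The relay removes an event only after a successful publish, so a crash between publish and removal would deliver the event twice. Consumers of a relay like this should therefore tolerate duplicates.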
Why the arithmetic understates reality¶
In practice, achieved availability usually falls short of the product:
- Retry amplification. A downstream latency or error spike causes upstream retries, which increase downstream load, which causes more retries.
- Shared failure domains. A single AZ outage takes out multiple "independent" dependencies at once.
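- Tail-latency propagation. A healthy downstream with p99.9 = 1s still passes that 1s tail to its caller if it's on the hot path; once a request makes around ten such calls, roughly 1% of requests hit the slow tail and it shows up in the caller's p99.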
So 99.9% × 99.9% = 99.8% is a best case, not a typical case.
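Two of the effects above have simple back-of-envelope forms. The sketch below assumes independent calls for the tail estimate and a worst case for retries in which every layer exhausts its attempts; the function names are illustrative.

```python
def prob_request_hits_tail(calls_per_request, tail_fraction=0.001):
    """Probability that at least one of several independent downstream
    calls lands in the slow tail (tail_fraction=0.001 means beyond p99.9)."""
    return 1.0 - (1.0 - tail_fraction) ** calls_per_request


def worst_case_retry_load(attempts_per_layer, layers):
    """Worst-case load multiplier at the bottom of a call chain when every
    layer makes up to `attempts_per_layer` attempts (first try included)."""
    return attempts_per_layer ** layers


if __name__ == "__main__":
    for calls in (1, 10, 100):
        print(f"{calls:>3} calls: {prob_request_hits_tail(calls):.2%} hit the tail")
    # Three layers, each trying up to three times.
    print(f"retry load multiplier: {worst_case_retry_load(3, 3)}x")
```

The retry figure is an upper bound; retry budgets and backoff exist to keep real systems well below it.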
The inverse: every removed dependency buys reliability¶
Moving Nakadi off Order Store's critical path is a 0.1 percentage-point availability gain on paper (99.8% → 99.9%). In practice the gain is larger, because Nakadi outages, even brief ones, no longer produce 5xx at the REST API. The write still succeeds; the event just arrives later via DynamoDB Streams + a Lambda relay.
Seen in¶
- sources/2022-02-02-zalando-utilizing-amazon-dynamodb-and-aws-lambda-for-asynchronous-event-publication — Zalando Payments's Order Store. The canonical worked example on the wiki: two 99.9% deps → 99.8% ceiling motivates the transactional-outbox redesign; removing Nakadi from the sync path restores 99.9%. The article opens with the arithmetic, which is itself unusual — most architecture posts skip the motivation and jump to the mechanism.
Related¶
- concepts/availability-dependency — the qualitative framing (who takes me down when they fail?); this page is the quantitative version.
- concepts/eventual-consistency — what you trade for removing a sync dependency.
- concepts/event-driven-architecture — the aggregate shape that avoids dependency multiplication by default.
- patterns/transactional-outbox — the canonical pattern for removing a "must succeed" event-publish dependency.
- patterns/dynamodb-streams-plus-lambda-outbox-relay — the concrete DynamoDB realisation.
- companies/zalando