
Availability multiplication of dependencies

Definition

A service that synchronously depends on N downstream components has its request availability upper-bounded by the product of the components' individual availabilities. Two 99.9% dependencies on the critical path cap the caller at 99.9% × 99.9% ≈ 99.8%. Every downstream hop that must succeed for the caller to return a 2xx adds one more factor to the product and lowers the ceiling further.

The arithmetic is obvious; the architectural consequence is not: teams often compose services freely on the synchronous path, unaware that each additional "must-succeed" downstream call erodes their SLO.

Arithmetic

For independent failures (approximation — in practice failures cluster on shared infra, but the directional point stands):

N deps @ 99.9%    Availability ceiling
1                 99.9%
2                 99.8%
3                 99.7%
5                 99.5%
10                99.0%

For mixed levels, multiply: A = A1 × A2 × ... × An.
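A minimal sketch of the product rule in Python (the function name is ours); it reproduces the table above:

```python
def availability_ceiling(*availabilities: float) -> float:
    """Upper bound on a caller's availability when every listed
    dependency must succeed (assumes independent failures)."""
    ceiling = 1.0
    for a in availabilities:
        ceiling *= a
    return ceiling

# Reproduce the table above: N dependencies at 99.9% each.
for n in (1, 2, 3, 5, 10):
    print(n, f"{availability_ceiling(*[0.999] * n):.3%}")
# 1 99.900%
# 2 99.800%   (99.8001%, the two-dependency Zalando example)
# 3 99.700%
# 5 99.501%
# 10 99.004%
```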

The Zalando Payments framing

Zalando Payments' Order Store service originally performed two synchronous operations per REST call: write to DynamoDB and publish a change event to Nakadi (Zalando's Kafka-backed event bus). With both downstreams individually at 99.9%, the service was capped at 99.8%.

"As the availability of a service is the product of the availabilities of its dependencies, the more dependencies a service has, the lesser is its own availability. Let's assume DynamoDB and the message bus have availabilities of 99.9% each. Thus, the maximum availability for the service is 99.9% × 99.9% = 99.8%." (Source: sources/2022-02-02-zalando-utilizing-amazon-dynamodb-and-aws-lambda-for-asynchronous-event-publication)

The fix: push Nakadi publication off the synchronous path using the transactional outbox pattern — the service's critical path depends only on DynamoDB, so its ceiling returns to 99.9%. Nakadi unavailability can still delay event delivery, but it no longer fails the client's write.
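A sketch of the hot path before and after the change; the table name and handler are hypothetical (the actual Order Store code is not in the source):

```python
# Illustrative sketch only; resource names are placeholders.
import boto3

table = boto3.resource("dynamodb").Table("order-store")

def handle_write(order: dict) -> None:
    # Before: two must-succeed calls on the hot path (ceiling 99.8%).
    #   table.put_item(Item=order)              # DynamoDB @ 99.9%
    #   nakadi.publish("order-changed", order)  # Nakadi @ 99.9%; an outage fails the write
    #
    # After: the only synchronous dependency is DynamoDB (ceiling 99.9%).
    # The change event is derived later from the table's stream.
    table.put_item(Item=order)
```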

The four architectural responses

Once the product-ceiling problem is recognised, four moves shrink the dependency count on the critical path:

  1. Decouple via durable log / outbox. Write once to a local durable store; a separate relay reads and fans out to secondary sinks asynchronously. See patterns/transactional-outbox, patterns/dynamodb-streams-plus-lambda-outbox-relay.
  2. Cache the answer. If the downstream is a read, keep a recent answer and serve it during an outage; a sketch follows this list. See concepts/cache-for-availability.
  3. Fail open where safe. Degraded-but-up beats down for non-critical paths (typically analytics, suggestions, personalisation).
  4. Shed the dependency entirely. Sometimes the right move is to ask "does this call belong on the hot path?" — the answer is often no.
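A minimal sketch combining responses 2 and 3, assuming a plain dict as the cache and a non-critical read; all names are illustrative:

```python
from typing import Any, Callable, Optional

def cached_fail_open(fetch: Callable[[], Any],
                     cache: dict, key: str) -> Optional[Any]:
    """Prefer a fresh answer; serve the last known one during an outage;
    fail open (None) for a non-critical feature if nothing is cached."""
    try:
        cache[key] = fetch()   # happy path: refresh the cache
    except Exception:
        pass                   # downstream outage: fall through to the cache
    return cache.get(key)      # stale answer, or None (degraded-but-up)
```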

Why the arithmetic understates reality

In practice the ceiling is worse than the product:

  • Retry amplification. A downstream latency spike causes upstream retries, which increase downstream load, which causes more retries; a load-multiplier sketch follows this list.
  • Shared failure domains. A single AZ outage takes out multiple "independent" dependencies at once.
  • Tail-latency propagation. Even a healthy downstream with p99.9 = 1 s blows the caller's p99 budget if it sits on the hot path.
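A back-of-the-envelope illustration of the first bullet (the function name is ours): with naive retry-on-failure and independent attempts, the expected downstream calls per upstream request grow exactly when the downstream is already failing.

```python
def load_multiplier(p_fail: float, max_attempts: int) -> float:
    """Expected downstream calls per upstream request when each failed
    attempt is retried, up to max_attempts (independent attempts assumed)."""
    return sum(p_fail ** i for i in range(max_attempts))

print(load_multiplier(0.001, 3))  # ~1.001: healthy downstream, retries are cheap
print(load_multiplier(0.9, 3))    # ~2.71: a struggling downstream gets ~3x the load
```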

So 99.9% × 99.9% = 99.8% is a best case, not a typical case.

The inverse: every removed dependency buys reliability

Moving Nakadi off Order Store's critical path is a 0.1-percentage-point availability gain on paper (99.8% → 99.9%). In practice the win is larger, because Nakadi outages, even brief ones, no longer produce 5xx at the REST API. The write still succeeds; the event just arrives later via DynamoDB Streams + a Lambda relay.
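A sketch of what such a relay could look like as a Lambda handler on the table's stream; the Nakadi URL, event type, and auth handling are placeholders, not Zalando's actual configuration:

```python
import json
import os
import urllib.request

# Placeholder, e.g. https://nakadi.example/event-types/order-changed/events
NAKADI_URL = os.environ["NAKADI_URL"]

def handler(event, context):
    # A DynamoDB Streams trigger delivers a batch of committed writes.
    events = [
        {"new_image": record["dynamodb"]["NewImage"]}
        for record in event["Records"]
        if record["eventName"] in ("INSERT", "MODIFY")
    ]
    if not events:
        return
    req = urllib.request.Request(
        NAKADI_URL,
        data=json.dumps(events).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # A raised exception makes Lambda retry the batch, so delivery is
    # at-least-once and possibly delayed; never a 5xx on the original write.
    urllib.request.urlopen(req)
```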
