CONCEPT Cited by 1 source
At-most-once delivery
At-most-once delivery is a messaging / RPC delivery semantic in which each message is delivered zero or one times, never more. If the network drops the message, or the destination process crashes before accepting it, the message is lost: no retry, no persistence, no duplicate. The counterparts are at-least-once (retry until acknowledged, with duplicates tolerated) and exactly-once (in practice, at-least-once plus idempotency or dedup at the receiver).
At-most-once is attractive for throughput and simplicity: no ack protocol, no retry state, no dedup needed. It is unsafe for any flow where a lost message leaves producer and consumer with divergent state.
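The difference between the three semantics can be seen in a small in-memory simulation. The sketch below is illustrative only: `LossyLink`, its scripted outcomes (`"ok"`, `"drop_msg"`, `"drop_ack"`), and the sender functions are hypothetical names invented for this example, not the API of any real broker.

```python
class LossyLink:
    """Hypothetical link whose behaviour per transmission follows a fixed script."""

    def __init__(self, outcomes):
        # Each entry is "ok", "drop_msg" (message lost) or "drop_ack" (ack lost).
        self._outcomes = iter(outcomes)
        self.inbox = []  # what the receiver actually got, in order

    def transmit(self, msg_id, payload):
        """Send once. Returns True only if the sender sees an acknowledgement."""
        outcome = next(self._outcomes, "ok")
        if outcome == "drop_msg":
            return False
        self.inbox.append((msg_id, payload))
        return outcome != "drop_ack"


def send_at_most_once(link, msg_id, payload):
    # Fire and forget: the result is ignored and nothing is retried.
    link.transmit(msg_id, payload)


def send_at_least_once(link, msg_id, payload, max_attempts=5):
    # Retry until acknowledged; a lost ack causes a duplicate delivery.
    for _ in range(max_attempts):
        if link.transmit(msg_id, payload):
            return True
    return False


def deduplicate(inbox):
    # Receiver-side dedup by message id: at-least-once + dedup behaves
    # like exactly-once from the application's point of view.
    seen = set()
    unique = []
    for msg_id, payload in inbox:
        if msg_id not in seen:
            seen.add(msg_id)
            unique.append((msg_id, payload))
    return unique
```

With a script of `["drop_msg"]`, the at-most-once sender leaves the inbox empty and never finds out. With `["drop_msg", "drop_ack", "ok"]`, the at-least-once sender eventually succeeds but the receiver holds two copies, which `deduplicate` collapses to one.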
Canonical failure instance at Fly.io
NATS core is at-most-once. Fly.io used NATS
in 2022 to push WireGuard peer configs from the GraphQL API to
the right regional gateway. NATS dropped some percentage of
these pushes. The result was divergent state: flyctl
received the config in its GraphQL response, believed the peer
was installed, opened the handshake — and the gateway had no
peer for it.
"NATS is fast, but doesn't guarantee delivery. ... Our NATS cluster was losing too many messages to host a reliable API on it." (Source: sources/2024-03-12-flyio-jit-wireguard-peers)
The eventual fix was architectural, not semantic: switch to patterns/pull-on-demand-replacing-push, which has no delivery-guarantee requirement because there is no async RPC in the critical path — the gateway fetches the config synchronously when it needs it.
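A minimal sketch of why the pull model sidesteps the delivery guarantee, under stated assumptions: `ControlPlane`, `PushGateway`, and `PullGateway` are hypothetical classes made up for illustration and do not reflect Fly.io's actual code or the WireGuard protocol. The only point modelled is where the peer config comes from at handshake time.

```python
class ControlPlane:
    """Hypothetical source of truth for peer configs (stands in for the API)."""

    def __init__(self):
        self.peers = {}  # public key -> config

    def create_peer(self, pubkey, config):
        self.peers[pubkey] = config

    def lookup(self, pubkey):
        return self.peers.get(pubkey)


class PushGateway:
    """Gateway that only knows peers whose push message actually arrived."""

    def __init__(self):
        self.installed = {}

    def on_push(self, pubkey, config):
        self.installed[pubkey] = config

    def handshake(self, pubkey):
        # If the push was dropped, the peer is unknown and the handshake fails.
        return pubkey in self.installed


class PullGateway:
    """Gateway that fetches an unknown peer synchronously when it first appears."""

    def __init__(self, control_plane):
        self.control_plane = control_plane
        self.installed = {}

    def handshake(self, pubkey):
        if pubkey not in self.installed:
            config = self.control_plane.lookup(pubkey)  # synchronous fetch
            if config is None:
                return False  # genuinely unknown peer
            self.installed[pubkey] = config
        return True
```

If the control plane creates a peer and the push is lost, `PushGateway.handshake` returns `False` even though the client believes the peer exists. `PullGateway.handshake` succeeds for the same peer because no asynchronous message has to arrive first; a failed lookup surfaces as an immediate, retryable error rather than silent divergence.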
Seen in
- sources/2024-03-12-flyio-jit-wireguard-peers — NATS as at-most-once push transport that dropped WireGuard peer configs.
Related
- systems/nats — canonical at-most-once substrate at Fly.io.
- concepts/thundering-herd — adjacent delivery pathology.
- patterns/pull-on-demand-replacing-push — the architecture that makes at-most-once-loss irrelevant.