
At-least-once delivery

Definition

At-least-once delivery is the messaging guarantee that each message will eventually be delivered to a consumer at least once, with no upper bound on how many duplicates the consumer may observe. The system prefers repeating a message to losing one. Most managed message buses — Kafka, SQS, DynamoDB Streams → Lambda, Nakadi, SNS fan-out — pick this as their default guarantee because it's achievable with simple broker logic and survives broker crashes, network partitions, and consumer failures.

The three classic guarantees

Guarantee        Delivery          Loss risk         Duplicate risk
At-most-once     ≤ 1 copy          yes (on crash)    no
At-least-once    ≥ 1 copy          no                yes
Exactly-once     exactly 1 copy    no                no

Exactly-once is expensive — it requires transactional producers and consumers, coordinated commits, and deduplication state. Most production systems choose at-least-once and push idempotency to the consumer.

Where duplicates come from

Even a well-behaved broker will produce duplicates at the consumer in any of these scenarios:

  • Consumer crashes after processing but before acknowledging. Broker redelivers on restart.
  • Acknowledgement is lost in transit. Network blip between consumer and broker.
  • Producer retries after timeout. Producer got no response, re-sent the message; broker accepted both copies.
  • DLQ requeue. A dead-letter handler drains the DLQ and re-publishes; consumers see the event twice (once for the original retry chain, once from the requeue).
  • Partition rebalance mid-flight. The new owner resumes from the last committed offset, which may trail the last message actually processed.
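
The first failure mode is worth seeing in code. A minimal sketch, assuming a hypothetical broker client whose receive() and ack() methods stand in for SQS receive_message/delete_message or a Kafka poll/commit pair: because the acknowledgement comes after processing, a crash between the two steps leaves the broker unaware the message was handled, and it redelivers.

    # Hypothetical broker client: receive() hands over an in-flight message,
    # ack() removes it from the queue/advances the offset.
    def consume_loop(broker, handler):
        while True:
            msg = broker.receive()   # broker marks the message in-flight
            handler(msg)             # side effects happen here
            # A crash here (after processing, before ack) loses the ack:
            # the broker's visibility timeout expires and it redelivers.
            broker.ack(msg.id)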

The consumer-side mandate: idempotency

Under at-least-once, every consumer must tolerate seeing a message more than once without applying its side effects twice. Common idempotency patterns:

  • Dedup by message ID — track seen IDs in a bounded-time key-value store; drop duplicates.
  • Idempotent writes — model the downstream write as "set state to X" not "increment by 1"; replays converge.
  • Upserts with logical timestamps — last-write-wins by client-supplied monotonic clock.
  • Write + compare — compute the effect; compare to current state; only apply the delta.

Which one fits depends on what the consumer does. Event-sourcing consumers naturally do the last one (write + compare); cache-update consumers fit the idempotent-write form; notification senders use dedup-by-ID.
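
A minimal dedup-by-message-ID sketch in Python, assuming each message carries a stable "id" field; SeenStore and apply_side_effects are illustrative names, and the in-process TTL dict stands in for what would be a shared store (e.g. a Redis key with a TTL) in a multi-consumer deployment:

    import time

    class SeenStore:
        """Bounded-time set of message IDs. In production this would live
        in a shared store with expiry, not in process memory."""
        def __init__(self, ttl_seconds=3600):
            self.ttl = ttl_seconds
            self.expiry = {}  # message_id -> time after which it may be forgotten

        def already_seen(self, message_id):
            now = time.time()
            # evict expired IDs so the store stays bounded
            self.expiry = {m: t for m, t in self.expiry.items() if t > now}
            return message_id in self.expiry

        def mark_seen(self, message_id):
            self.expiry[message_id] = time.time() + self.ttl

    store = SeenStore()

    def apply_side_effects(msg):
        print("handled", msg["id"])  # placeholder for the consumer's real work

    def handle(msg):
        if store.already_seen(msg["id"]):
            return                    # duplicate delivery: drop silently
        apply_side_effects(msg)
        store.mark_seen(msg["id"])    # mark only after success: a crash in
                                      # between means a retry, not a loss

Marking the ID only after processing succeeds keeps the guarantee at-least-once; marking it first would silently drop a message whose processing then crashed.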

The Zalando framing

Zalando's Lambda relay publishing events to Nakadi is explicitly at-least-once: on transient Nakadi failure, the Lambda retries with exponential backoff; on exhaustion, the event goes to an SQS DLQ; a Kubernetes CronJob requeues from DLQ until Nakadi accepts. The post acknowledges both the ordering and duplication consequences:

"In case the publication to Nakadi fails, e.g. due to timeouts, the request is retried. If all the retries fail then we make use of an AWS SQS queue as fallback storage... This also means that we do not guarantee that the events are published in the correct order." (Source: sources/2022-02-02-zalando-utilizing-amazon-dynamodb-and-aws-lambda-for-asynchronous-event-publication)

Consumers of those Nakadi events must handle both replays and reordering.
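
The shape of that relay can be sketched in a few lines. This is not Zalando's code: the Nakadi endpoint, queue URL, and retry count are placeholder assumptions, and it uses boto3 and requests directly rather than whatever clients the real Lambda uses.

    import json
    import time

    import boto3
    import requests

    sqs = boto3.client("sqs")
    DLQ_URL = "https://sqs.eu-central-1.amazonaws.com/123456789012/events-dlq"   # placeholder
    NAKADI_URL = "https://nakadi.example.org/event-types/my-event-type/events"   # placeholder

    def publish_with_fallback(events, max_retries=5):
        """Publish a batch to Nakadi, retrying with exponential backoff;
        on exhaustion, park the batch in the SQS DLQ for a requeue job."""
        for attempt in range(max_retries):
            try:
                resp = requests.post(NAKADI_URL, json=events, timeout=5)
                if resp.status_code == 200:   # treat anything else as failure
                    return
            except requests.RequestException:
                pass  # timeout or connection error: fall through and retry
            time.sleep(2 ** attempt)          # exponential backoff
        # All retries exhausted: fallback storage. The duplicate hazard is
        # baked in here: Nakadi may have accepted a request whose response
        # timed out, so the eventual DLQ requeue delivers a second copy.
        sqs.send_message(QueueUrl=DLQ_URL, MessageBody=json.dumps(events))

The duplicate window is structural, not a bug: a timed-out publish may have succeeded server-side, which is exactly why downstream consumers need the idempotency patterns above.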

Ordering vs delivery

At-least-once and per-key ordering are orthogonal:

  • Some systems preserve per-key order on redelivery (Kafka by partition, DynamoDB Streams by item hash key).
  • Some don't (any system with DLQ requeue or parallel retry workers — as in Zalando's design).

If the consumer needs stable order, it has to either (a) pick a substrate that preserves order and accept serial processing per key, or (b) reconstruct order from sequence numbers or logical timestamps embedded in the payload.
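
A minimal sketch of option (b), assuming the producer embeds a per-key monotonic "seq" in each payload (an assumption; the Zalando post does not describe this). The consumer keeps the highest sequence applied per key, which absorbs both duplicates and reordering:

    state = {}    # key -> latest state (stands in for the downstream store)
    applied = {}  # key -> highest sequence number applied so far

    def apply_in_order(event):
        """Last-write-wins by embedded sequence number: a duplicate has an
        equal seq, a reordered stale event has a lower seq; both are dropped."""
        key, seq = event["key"], event["seq"]
        if seq <= applied.get(key, -1):
            return                     # duplicate or stale delivery: drop
        state[key] = event["state"]    # idempotent "set state to X" write
        applied[key] = seq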
