
CONCEPT Cited by 1 source

At-least-once uploads for cost reduction

In a 3-way-replicated stateful system that tiers data out to object storage, the naive approach is to have every replica upload its local blocks. That costs 3× the uploads and leaves three copies of each block in object storage, which compaction must later deduplicate.

A cheaper posture: only 2 of the 3 replicas upload. This gives at-least-once delivery: as long as one of the two uploading replicas succeeds, the block lands in object storage. The third replica's local copy serves as redundant local storage, not as an upload source.
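A minimal sketch of the idea, assuming a hypothetical selection policy (the first two replicas in sorted order) and a toy in-memory object store. The names `ObjectStore`, `pick_uploaders`, and `upload_block` are illustrative and do not come from any particular system.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectStore:
    """Toy object store: key -> block id. Each replica writes under its own prefix."""
    objects: dict = field(default_factory=dict)

    def put(self, replica: str, block_id: str) -> None:
        self.objects[f"{replica}/{block_id}"] = block_id


def pick_uploaders(replicas, count=2):
    """Assumed policy: the first `count` replicas in sorted order are the uploaders."""
    return sorted(replicas)[:count]


def upload_block(store, replicas, block_id, healthy):
    """Every healthy uploader uploads its own copy; returns how many copies landed."""
    copies = 0
    for replica in pick_uploaders(replicas):
        if replica in healthy:
            store.put(replica, block_id)
            copies += 1
    return copies


store = ObjectStore()
replicas = ["r1", "r2", "r3"]
# Both uploaders healthy: two copies land (at-least-once, with a duplicate).
both = upload_block(store, replicas, "b1", healthy={"r1", "r2", "r3"})
# One uploader down: the block still lands once.
one = upload_block(store, replicas, "b2", healthy={"r2", "r3"})
```

In this sketch the block is lost from the upload path only when both chosen uploaders fail at the same time. The third replica never uploads unless the selection changes.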

Why this is still durable + consistent

  • In-memory and on-disk data is still replicated 3×: quorum writes guarantee that an acknowledged sample exists on at least 2 of the 3 replicas at the memory/disk tiers.
  • Object-storage durability is inherited from the cloud provider (S3 / GCS / Azure Blob) and is typically at least 11 nines (99.999999999%). The weak link for redundancy is the upload itself, not the storage.
  • Compaction deduplicates identical blocks uploaded from two replicas, so having two independent uploaders doesn't permanently inflate the object store.
  • If one of the two uploaders is unavailable, the other still delivers. This is 1-of-2 redundancy on the upload path: any single successful upload is enough, so no majority agreement is required.
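The deduplication step can be sketched as follows, assuming that blocks uploaded by different replicas share a block id and carry identical payloads. Real compactors usually merge overlapping data rather than compare whole blocks, so `compact` here is a simplified, hypothetical stand-in.

```python
def compact(uploaded):
    """Collapse duplicate uploads into one copy per block id.

    `uploaded` is an iterable of (replica, block_id, payload) tuples.
    Assumption: copies of the same block id are identical, so the first wins.
    """
    merged = {}
    for _replica, block_id, payload in uploaded:
        merged.setdefault(block_id, payload)
    return merged


uploads = [
    ("r1", "b1", b"samples-1"),
    ("r2", "b1", b"samples-1"),  # duplicate of b1 from the second uploader
    ("r2", "b2", b"samples-2"),  # r1 was unavailable for b2
]
compacted = compact(uploads)  # one entry each for b1 and b2
```

With two uploaders, the compactor sees at most two copies of a block instead of three, which also reduces its own read I/O.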

The third replica is effectively a cold-standby uploader: if one of the two active uploaders is removed from the cluster, the third is promoted and starts uploading. In the average case, going from three uploads per block to two cuts upload egress cost by one third (about 33%).
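A small sketch of the promotion rule and the savings arithmetic, assuming uploaders are simply the first two surviving replicas in a fixed order. `active_uploaders` and `upload_saving` are hypothetical helper names.

```python
def active_uploaders(replicas, removed, count=2):
    """Return the replicas that upload, skipping any removed from the cluster.

    Assumption: order in `replicas` is the promotion order, so the standby
    moves up automatically when an earlier replica is removed.
    """
    remaining = [r for r in replicas if r not in removed]
    return remaining[:count]


def upload_saving(replication_factor=3, uploaders=2):
    """Fraction of upload cost saved versus every replica uploading."""
    return 1 - uploaders / replication_factor


replicas = ["r1", "r2", "r3"]
normal = active_uploaders(replicas, removed=set())      # r1 and r2 upload, r3 stands by
promoted = active_uploaders(replicas, removed={"r1"})   # r3 is promoted alongside r2
saving = upload_saving()                                # 1 - 2/3, about one third
```

The saving applies per block uploaded; it assumes upload cost scales linearly with the number of uploading replicas.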

Why this is canonically cheap

At hyperscale, the cost of the object-storage upload path (egress bandwidth, PUT API calls, compaction I/O) is material relative to the at-rest storage cost. Trimming redundant uploads is a direct line-item saving.

