PATTERN Cited by 1 source
Conditional-write lease (CASAAS — single-writer via object-store CAS)¶
Pattern¶
Implement a time-based lease — a mutex with expiration — on
top of an object store's conditional
write primitive. A single writer holds a lease file in the
bucket; renewal succeeds only via If-Match: <current-etag>
(compare-and-set). Expired leases can be re-acquired by any
other writer via the same atomic check. The object store itself
becomes the coordination substrate — no external Consul / etcd /
Zookeeper dependency.
Ben Johnson coined the name in the 2025-05-20 Litestream redesign post:
"Modern object stores like S3 and Tigris solve this problem for us: they now offer conditional write support. With conditional writes, we can implement a time-based lease. We get essentially the same constraint Consul gave us, but without having to think about it or set up a dependency."
The backronym from the post: CASAAS — Compare-and-Swap as a Service.
Mechanism (high level)¶
lease object: s3://bucket/.lease
contents:     { "owner": "host-123", "exp": <unix-time> }

Writer W1 tries to acquire lease:
  GET /.lease → (object, etag=E0), or 404 if no lease exists yet
  if absent:  PUT /.lease with body {"owner":"W1", "exp":now+T},
              header "If-None-Match: *"
  if expired: PUT /.lease with body {"owner":"W1", "exp":now+T},
              header "If-Match: E0"
  else:       back off
  PUT returns:
    200 OK                  → W1 is now the leaseholder
    412 Precondition Failed → someone else advanced it; re-read and retry

Renewal (W1 periodically):
  PUT /.lease with new exp, "If-Match: <our-current-etag>"
  fail → W1 is no longer the leaseholder (crashed/partitioned)
Because the conditional write is atomic at the storage layer, exactly one writer wins any contested acquisition. The object store enforces mutual exclusion without a separate coordination service.
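The acquire and renew steps can be sketched against an in-memory stand-in for the bucket. This is a minimal illustration, not Litestream's implementation: `FakeObjectStore`, `PreconditionFailed`, `try_acquire`, and `renew` are hypothetical names, the lock models the storage layer's atomic precondition check, and the create-if-absent branch is an assumption about how the first-ever acquisition would be handled.

```python
import itertools
import json
import threading
from typing import Optional, Tuple


class PreconditionFailed(Exception):
    """Stands in for an HTTP 412 response."""


class FakeObjectStore:
    """Hypothetical in-memory bucket that supports conditional PUT."""

    def __init__(self) -> None:
        self._lock = threading.Lock()   # models storage-layer atomicity
        self._objects = {}              # key -> (body, etag)
        self._etags = itertools.count(1)

    def get(self, key: str) -> Optional[Tuple[bytes, str]]:
        with self._lock:
            return self._objects.get(key)

    def put(self, key: str, body: bytes, *, if_match: Optional[str] = None,
            if_none_match: bool = False) -> str:
        with self._lock:
            current = self._objects.get(key)
            if if_none_match and current is not None:
                raise PreconditionFailed(key)
            if if_match is not None and (current is None or current[1] != if_match):
                raise PreconditionFailed(key)
            etag = f"E{next(self._etags)}"
            self._objects[key] = (body, etag)
            return etag


LEASE_KEY = ".lease"


def try_acquire(store: FakeObjectStore, owner: str, now: float, ttl: float) -> Optional[str]:
    """Return the new ETag if `owner` took the lease, else None."""
    body = json.dumps({"owner": owner, "exp": now + ttl}).encode()
    current = store.get(LEASE_KEY)
    try:
        if current is None:
            # No lease yet: create it only if nobody else has in the meantime.
            return store.put(LEASE_KEY, body, if_none_match=True)
        if json.loads(current[0])["exp"] > now:
            return None  # lease is still held; caller should back off
        # Expired: take over only if the object is unchanged since our GET.
        return store.put(LEASE_KEY, body, if_match=current[1])
    except PreconditionFailed:
        return None  # another writer advanced the lease first


def renew(store: FakeObjectStore, owner: str, etag: str, now: float, ttl: float) -> Optional[str]:
    """Extend the lease; None means the lease was lost and writing must stop."""
    body = json.dumps({"owner": owner, "exp": now + ttl}).encode()
    try:
        return store.put(LEASE_KEY, body, if_match=etag)
    except PreconditionFailed:
        return None
```

Each successful call returns the ETag the holder must present on its next renewal; a stale ETag is what tells a displaced writer it has lost the lease.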
Prerequisites¶
- Strong consistency for reads of the lease object. The precondition check must see the current state, not a stale replica. S3 has provided this since 2020; see concepts/strong-consistency.
- Conditional writes (If-Match / If-None-Match) at the object API. On general-purpose buckets, S3 added If-None-Match in August 2024 and If-Match in November 2024; Tigris ships the same semantics.
Both must be present: a store that lacks either conditional writes (S3 before 2024) or strong consistency (S3 before 2020) cannot host a safe lease protocol.
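As an illustration of how the two preconditions map onto an S3 client call, the sketch below builds the keyword arguments for a conditional `PutObject`. It assumes a recent boto3 whose `put_object` accepts `IfMatch` and `IfNoneMatch`; `conditional_put_kwargs` is a hypothetical helper, and the bucket and key are placeholders.

```python
from typing import Optional


def conditional_put_kwargs(bucket: str, key: str, body: bytes,
                           expected_etag: Optional[str]) -> dict:
    """Build keyword arguments for a conditional S3 PutObject call.

    expected_etag=None means "create only if the object does not exist".
    Otherwise the write succeeds only if the stored ETag still matches.
    """
    kwargs = {"Bucket": bucket, "Key": key, "Body": body}
    if expected_etag is None:
        kwargs["IfNoneMatch"] = "*"         # sent as "If-None-Match: *"
    else:
        kwargs["IfMatch"] = expected_etag   # sent as "If-Match: <etag>"
    return kwargs


# Intended use (assumes `s3 = boto3.client("s3")`, not imported here):
#   s3.put_object(**conditional_put_kwargs("bucket", ".lease", body, etag))
# A failed precondition is expected to surface as an HTTP 412 error.
```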
Why it matters for Litestream¶
The pre-revamp Litestream had a dedicated abstraction — "generations" — to handle desynchronisation when the replication stream was interrupted (crash, new-server start). A generation uniquely identified a snapshot + WAL-stream pair; any break in the WAL sequence created a new generation, and the replica had to know about all of them.
Two consequences of the generation abstraction were major pain points:
- Read-replica and failover support were complicated by having to choose which generation to follow.
- Multiple writers pointed at the same destination (if deployed carelessly — e.g. rolling deploys) would create interleaved garbage-generation streams.
CASAAS collapses this: with a single-writer constraint enforced by the lease, only one generation ever exists at a given destination. The generations abstraction can be retired.
Additional feature unlock: "you can run Litestream with ephemeral nodes, with overlapping run times, and even if they're storing to the same destination, they won't confuse each other." Rolling deploys, Fly-Machine restarts, and blue/green cutovers are all naturally safe.
Contrasts with LiteFS's approach¶
LiteFS originally used Consul (via FlyConsul) for primary-election / single-leader enforcement. This worked but required users to know about and configure Consul — a large usability tax, cited explicitly in the redesign post as "part of the reason Litestream is so much more popular than LiteFS." CASAAS removes the external dependency by reusing the object store that Litestream already needs.
Trade-offs¶
- Clock skew matters. A time-based lease assumes bounded clock skew between the would-be leaseholder and the object store (or among writers). Large skew can let two writers think they hold the lease simultaneously in narrow windows. Production settings typically use NTP-synchronised hosts and lease durations significantly larger than worst-case skew.
- Lease duration is a cost-vs-failover knob. Short leases mean faster failover if the leaseholder dies, but more PUTs on the lease object (billable operations). Long leases mean lower operations cost, but longer unavailability if the leaseholder crashes.
- Object-store rate limits are the ceiling. The lease object is a hot key. High-frequency renewal could hit per-key rate limits on S3 or Tigris — mitigated by modest lease TTLs (seconds-to-tens-of-seconds) rather than sub-second.
- Storage cost is negligible. A few bytes of lease metadata, overwritten at TTL cadence.
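The timing trade-offs above reduce to a little arithmetic. The following is a rough sketch under stated assumptions: `LeaseTiming` and its fields are hypothetical, the skew and round-trip bounds are values an operator would have to estimate, and the failover figure ignores how often a challenger polls the lease.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LeaseTiming:
    ttl_s: float          # lease duration T written into "exp"
    renew_every_s: float  # how often the healthy holder renews
    max_skew_s: float     # assumed worst-case clock skew between writers
    max_rtt_s: float      # assumed worst-case round trip for a PUT

    def puts_per_day(self) -> float:
        """Renewal PUTs issued by a healthy leaseholder in 24 hours."""
        return 86_400 / self.renew_every_s

    def worst_case_failover_s(self) -> float:
        """Longest wait before a challenger may take over after a crash.

        The holder may crash right after renewing, and a challenger whose
        clock runs slow sees the expiry up to max_skew_s late.
        """
        return self.ttl_s + self.max_skew_s

    def holder_may_write(self, local_now: float, exp: float) -> bool:
        """Conservative check: stop early enough to absorb skew and one RTT."""
        return local_now + self.max_skew_s + self.max_rtt_s < exp


timing = LeaseTiming(ttl_s=30.0, renew_every_s=10.0, max_skew_s=1.0, max_rtt_s=0.5)
```

With these illustrative numbers the holder issues 8,640 renewal PUTs per day, and a crashed holder blocks takeover for at most about 31 seconds. Halving the TTL and renewal interval halves the failover wait and doubles the PUT count.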
Generalisations¶
- Leader election for any service that already uses object storage. Same shape: one blob, conditional PUT, TTL.
- Distributed-lock service replacement. For workloads that only need single-writer semantics per resource (not Paxos/Raft strong consensus), CASAAS eliminates the coordination service entirely.
- Precedent: Iceberg / Delta / Hudi snapshot-pointer commits use the same shape to serialize catalog updates — conditional PUT of a pointer file to order commits. See patterns/conditional-write for the broader family.
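The pointer-commit shape mentioned above can be sketched as an optimistic retry loop. This illustrates the general idea only and is not how Iceberg, Delta, or Hudi implement commits: `PointerStore` and `commit` are hypothetical, an integer version stands in for the ETag, and the store is single-threaded for brevity.

```python
from typing import Callable, Dict, Optional, Tuple


class PointerStore:
    """Hypothetical in-memory store holding one versioned pointer per key."""

    def __init__(self) -> None:
        self._data: Dict[str, Tuple[str, int]] = {}  # key -> (value, version)

    def get(self, key: str) -> Optional[Tuple[str, int]]:
        return self._data.get(key)

    def put_if_version(self, key: str, value: str, expected: Optional[int]) -> bool:
        """Write only if the stored version equals `expected` (None = absent)."""
        current = self._data.get(key)
        current_version = current[1] if current else None
        if current_version != expected:
            return False  # models a 412 Precondition Failed
        self._data[key] = (value, (expected or 0) + 1)
        return True


def commit(store: PointerStore, key: str,
           next_value: Callable[[Optional[str]], str],
           max_attempts: int = 5) -> str:
    """Optimistically advance the pointer; re-read and retry on conflict."""
    for _ in range(max_attempts):
        current = store.get(key)
        old_value, version = current if current else (None, None)
        new_value = next_value(old_value)
        if store.put_if_version(key, new_value, version):
            return new_value
    raise RuntimeError("too much contention on " + key)
```

Unlike the lease, nothing here is time-based: each commit simply loses or wins the race on the pointer, and losers recompute against the new state.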
When this pattern fails¶
- Writers not colocated with the object-store region: high RTT inflates renewal latency, so each renewal consumes more of the lease window and leaves less margin to absorb clock skew.
- Need Paxos-class consensus, not just single-writer. CASAAS gives mutual exclusion; it does not give durable multi-decision consensus. Use a proper consensus service for that.
- Object store doesn't support conditional writes, or doesn't guarantee strong consistency on the lease read-path. Pre-2024 S3, most pre-2020 object stores, and many eventually-consistent stores all fail this.
Seen in¶
- sources/2025-05-20-flyio-litestream-revamped — canonical wiki instance; Ben Johnson's "CASAAS" framing; replaces Consul-based leader election from LiteFS.
Related¶
- patterns/conditional-write — the parent pattern (CAS at the storage layer).
- systems/litestream — canonical consumer.
- systems/litefs — the sibling system that used Consul pre-this-redesign.
- systems/aws-s3 — conditional-write enabler (2024-11).
- systems/tigris — second object store named as supporting this primitive.
- concepts/strong-consistency — the read-side prerequisite.
- concepts/immutable-object-storage — the surrounding API contract.