

Near-realtime replica via L0 polling

Pattern

A read-replica of an object-store-backed database stays approximately current by polling the writer's finest-grained compaction level (L0) for newly uploaded change files and incrementally updating its local page index. No streamed-WAL channel, no local write journal, no replication protocol — just object-store reads on a polling cadence matched to the L0 upload cadence.

The writer side runs hierarchical LTX compaction with an L0 level that uploads fine-grained change files at a fast cadence (1 file / second in the Litestream case), retained only until the next L1 compaction step consolidates them. The reader side only needs to know the object-store path prefix:

Writer (Litestream Unix program on primary)
    │  emits LTX(i) to s3://bucket/path/l0/ every 1s
Object storage (S3 / Tigris / GCS / Azure Blob)
    │  Reader polls s3://bucket/path/l0/ every N ms
    │  For each new LTX file:
    │    - Range GET its index trailer (~1% of file)
    │    - Merge new (page → byte-offset) entries into
    │      in-memory page index
Reader (Litestream VFS loaded in application)
    │  Reads always resolve against the most-recent page index
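The reader side of the diagram can be sketched as a short loop. This is a toy model: the in-memory FakeStore, the fixed-size JSON "trailer", and every method name here are hypothetical stand-ins for illustration, not Litestream's LTX format or any real S3 client.

```python
import json

TRAILER_LEN = 256  # hypothetical fixed trailer size, not the LTX layout

class FakeStore:
    """In-memory stand-in for an object store (keys -> bytes)."""
    def __init__(self):
        self.objects = {}

    def put(self, key, body):
        self.objects[key] = body

    def list_prefix(self, prefix):
        # Lexicographic order stands in for upload order.
        return sorted(k for k in self.objects if k.startswith(prefix))

    def range_get(self, key, start, end):
        # Models an HTTP Range GET of part of an object.
        return self.objects[key][start:end]

def make_l0_file(page_offsets):
    """Fake L0 file: opaque page data followed by a padded JSON trailer."""
    trailer = json.dumps(page_offsets).encode().ljust(TRAILER_LEN)
    return b"\x00" * 1024 + trailer  # body bytes are irrelevant to the index

def poll_once(store, prefix, page_index, seen):
    """One polling pass: range-GET trailers of unseen files, merge entries."""
    for key in store.list_prefix(prefix):
        if key in seen:
            continue
        size = len(store.objects[key])  # a real reader gets this from the LIST
        trailer = store.range_get(key, size - TRAILER_LEN, size)
        # Newer L0 files overwrite older entries for the same page.
        page_index.update({int(p): off for p, off in json.loads(trailer).items()})
        seen.add(key)

store = FakeStore()
store.put("l0/000001.ltx", make_l0_file({"1": 0, "2": 4096}))
store.put("l0/000002.ltx", make_l0_file({"2": 0}))  # page 2 rewritten later

page_index, seen = {}, set()
poll_once(store, "l0/", page_index, seen)
print(page_index)  # {1: 0, 2: 0} — page 2 resolves to the newest file
```

The point of the sketch is the merge direction: each pass only fetches trailers of files it hasn't seen, and later files win per page.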

Canonical instance: Litestream VFS

From the 2025-12-11 shipping post:

"Quickly building an index and restore plan for the current state of a database is cool. But we can do one better. Because Litestream backs up (into the L0 layer) once per second, the VFS code can simply poll the S3 path, and then incrementally update its index. The result is a near-realtime replica. Better still, you don't need to stream the whole database back to your machine before you use it." (Source: sources/2025-12-11-flyio-litestream-vfs)

Litestream VFS realises this pattern: the 1-file-per-second L0 upload cadence bounds how fresh any reader can be, and the reader's polling interval plus the incremental index merge determine how close it actually gets to that bound — that gap is the substance of the "near-realtime" claim.

Why the L0 level specifically

The L0 level is special in the Litestream compaction ladder:

  • Uploaded every second — seconds-grained freshness.
  • Retained only until L1 compaction — no long-term storage cost.
  • Small file size — each L0 file is the second's worth of page changes, compressed per-page, with a trailer.

Reading L0 directly gives the reader the "near-realtime" property without running hierarchical compaction on the reader side. The reader doesn't need to reconstruct the hierarchy — it just needs to keep its in-memory page index current against the finest-grained level, and rely on the writer's compaction to eventually promote the records upward.
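One way to picture "keep the page index current against the finest-grained level" is a two-layer lookup: reads consult the L0 overlay first and fall back to the index built from the initial restore plan. The two-layer split and all names here are an assumption for illustration, not Litestream's internal structure.

```python
def resolve_page(page_no, l0_index, base_index):
    """Return (file, offset) for a page, preferring the freshest L0 entry."""
    if page_no in l0_index:
        return l0_index[page_no]
    return base_index[page_no]

# Index built once from the restore plan (higher compaction levels)...
base = {1: ("snapshot.ltx", 0), 2: ("snapshot.ltx", 4096), 3: ("snapshot.ltx", 8192)}
# ...overlaid by entries merged from polled L0 files.
l0 = {2: ("l0/000042.ltx", 0)}  # page 2 changed since the snapshot

print(resolve_page(2, l0, base))  # ('l0/000042.ltx', 0)
print(resolve_page(3, l0, base))  # ('snapshot.ltx', 8192)
```

The reader never reconstructs the compaction hierarchy; it just lets the overlay shadow the base, and trusts the writer's compaction to fold L0 upward over time.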

Contrast with: stream-replication protocols

Traditional read-replica protocols stream the write log:

Primitive                        Replication
Postgres streaming replication   WAL shipped over TCP
MySQL binlog replication         binlog events streamed
Kafka Connect                    log-compacted topic consumers
MongoDB replica set              oplog tailing
LiteFS                           node-to-node LTX streamed over HTTP between peers

All of these require a stateful connection from the replica to a primary or log source, with the attendant failover handling, reconnection semantics, and backpressure.

L0 polling inverts this:

  • No connection to maintain; the reader never talks to the writer.
  • Object storage is the source of truth.
  • Failover of the writer is invisible to readers (they see new files from whoever holds the CASAAS lease).
  • A reader that disappears and reconnects needs no resume token — it re-reads L0 from wherever it stopped, or restarts cold.
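The no-resume-token property in the last bullet falls out of key ordering: if L0 filenames sort in upload order, "wherever it stopped" is just the last key the reader processed. A sketch, with a hypothetical filename scheme:

```python
def unseen_keys(listed_keys, last_seen):
    """Keys strictly after last_seen in lexicographic (= upload) order.

    last_seen=None models a cold start: process everything listed.
    """
    return [k for k in sorted(listed_keys) if last_seen is None or k > last_seen]

keys = ["l0/000007.ltx", "l0/000008.ltx", "l0/000009.ltx"]
print(unseen_keys(keys, "l0/000007.ltx"))  # ['l0/000008.ltx', 'l0/000009.ltx']
print(unseen_keys(keys, None))             # cold start: all three
```

No handshake, no negotiated position: the "resume token" is a string the reader already holds, valid against whatever files exist when it comes back.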

Trade-offs

  • Polling cadence bounds freshness. Worst-case lag is roughly the L0 upload cadence plus the reader's poll interval: 1-second L0 + 1-second poll = ~2-second p99 lag in the Litestream case; tighter if the reader can afford more polls, looser if it's frugal. (The L0 retention window is a separate concern, covered below.)
  • Every poll is an object-store operation. Small constant cost per poll, dominated by list-prefix + get-tail-of-new-files. For cost-sensitive deployments with many replicas, polls add up.
  • No push-based backpressure. If L0 uploads slow down (writer lagging), the reader keeps polling an empty or stale prefix; it can't tell from the reader side whether the writer is healthy or down.
  • Retention window concern for readers. If a reader is slower than L0→L1 compaction, an L0 file may vanish before the reader reads it. The reader then has to fetch the equivalent change range from L1 — different file, different page-to-offset mapping, more data per read. The reader needs to know how to fall back.
  • No guaranteed "full database present" moment. Unlike a streaming replica that downloads a base snapshot then follows the log, a reader here never fully downloads the database. Every query that hits cold pages pays the round-trip cost even while polling L0 for currency.
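The freshness arithmetic in the first trade-off bullet, spelled out: the worst case is a write that lands just after an L0 upload, observed by a reader that polled just before the next file appeared.

```python
def worst_case_lag_s(upload_cadence_s, poll_interval_s):
    """Upper bound on staleness: wait for the next L0 upload, then for the
    next poll to notice it. Ignores transfer and merge time, which add a
    small constant on top."""
    return upload_cadence_s + poll_interval_s

print(worst_case_lag_s(1.0, 1.0))   # 2.0  — the Litestream default case
print(worst_case_lag_s(1.0, 0.25))  # 1.25 — a more eager reader
```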

When it's the wrong shape

  • Strong read-after-write required. 1-second lag is fine for "near-realtime" dashboards but fatal for strict RYW (e.g. authenticating a user immediately after registering them).
  • Writer not emitting a seconds-grained log. Without an L0 equivalent, the pattern collapses to plain snapshot-download reads.
  • Object-store polling too expensive. At sufficient replica count, the list-prefix + get-object-tail cost exceeds streaming-replication cost. Hybrid: use polling only on a few followers, streaming for the rest.
  • High-freshness interactive writes by the same user. A dashboard showing your own writes benefits more from a local write-through cache than from a polling L0 replica.

Seen in

  • sources/2025-12-11-flyio-litestream-vfs — canonical wiki instance. Litestream VFS polls L0 (1 upload per second) and incrementally updates its in-memory page index. "The result is a near-realtime replica." The 2025-10-02 v0.5.0 post had already disclosed L0 as one of Litestream's compaction levels, but the L0-as-replica-polling-source framing ships here.