

LTX compaction (time-window merge of SQLite page runs)

Pattern

Represent a database's changes as sorted per-transaction page-range files (LTX), then periodically k-way-merge adjacent time windows into larger files that keep only the latest version of each page. The output file is the point-in-time state of the database at the end of the window, encoded as a sorted page run.

Operationally, this is the LSM-tree compaction idea specialized for SQLite:

 Fine-grained LTX files (recent transactions)
   ┌──────┐ ┌──────┐ ┌──────┐ ┌──────┐
   │ tx1..│ │ tx3..│ │ tx7..│ │ tx9..│    (minutes of writes)
   └──────┘ └──────┘ └──────┘ └──────┘
       │       │        │        │
       └───────┴────────┴────────┘
              ┌─────────┐
              │ merged  │      (one file per hour; keeps only
              │ tx1..9  │       the latest page version)
              └─────────┘
              (daily window, etc.)

Why it works

Two properties of LTX make compaction cheap:

  1. Sorted by page number. Two LTX files can be merged with a straight k-way-merge streaming algorithm (see patterns/streaming-k-way-merge). No sort step required.
  2. Transaction-scoped. Merging two windows is safe because each input LTX file respects transaction boundaries — the merged file is equivalent to the database state at the end of the later window.
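The merge itself fits in a few lines. Below is a toy sketch, not the real LTX encoding: each file is modeled as a sorted list of (page number, payload) pairs, inputs are ordered oldest window first, and `heapq.merge` streams them together with last-wins deduplication.

```python
import heapq
from itertools import groupby

def compact(ltx_files):
    """Streaming k-way merge of sorted page runs (toy model).

    ltx_files: list of sorted [(pgno, payload), ...] runs, ordered
    oldest window first. Yields one sorted run holding only the
    newest payload per page. No global sort: each input run is
    consumed front to back.
    """
    # Tag each entry with its window index so ties on pgno sort
    # oldest-first; groupby then lets us keep the newest version.
    tagged = (
        ((pgno, i, payload) for pgno, payload in run)
        for i, run in enumerate(ltx_files)
    )
    merged = heapq.merge(*tagged)  # sorted by (pgno, window index)
    for pgno, group in groupby(merged, key=lambda e: e[0]):
        *_, (_, _, payload) = group  # last element = newest window
        yield pgno, payload

window_a = [(1, b"A0"), (2, b"B0"), (7, b"G0")]  # older window
window_b = [(2, b"B1"), (9, b"I1")]              # newer window
# Page 2 appears in both; only the newer version survives.
assert list(compact([window_a, window_b])) == [
    (1, b"A0"), (2, b"B1"), (7, b"G0"), (9, b"I1"),
]
```

Because every input is already sorted by page number, the merge runs in O(total entries) time with O(k) memory for k input files, which is what makes it viable as a background job over object storage.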

From Fly.io:

"This process of combining smaller time ranges into larger ones is called compaction. With it, we can replay a SQLite database to a specific point in time, with a minimal duplicate pages. This is similar to how an LSM tree works." (Source: sources/2025-05-20-flyio-litestream-revamped)

PITR cost structure

| Restore target                | Pre-LTX (shadow WAL)                                | Post-LTX (compaction)                     |
| ----------------------------- | --------------------------------------------------- | ----------------------------------------- |
| Restore to "now"              | snapshot + replay all WAL frames since the snapshot | download latest compacted LTX             |
| Restore to time t in the past | snapshot + replay all WAL frames to t               | download snapshot LTX at t + a few deltas |
| Effort scales with            | WAL volume (frame count)                            | distinct pages touched (compacted size)   |

The second row is where the pattern pays off: if a single page was rewritten a thousand times in the window, the shadow-WAL approach replays a thousand records; the LTX-compaction approach carries one page in the compacted output.
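A toy simulation of that asymmetry (plain Python dicts standing in for pages; illustrative only, not Litestream code):

```python
# A window in which one hot page (pgno 5) is rewritten 1,000 times.
wal_frames = [(5, f"v{i}") for i in range(1000)]  # (pgno, version)

# Shadow-WAL restore: replay every frame in order.
replayed = 0
state = {}
for pgno, payload in wal_frames:
    state[pgno] = payload
    replayed += 1

# LTX compaction: the window's output keeps only the latest version
# of each page, so later frames simply overwrite earlier ones.
compacted = {}
for pgno, payload in wal_frames:
    compacted[pgno] = payload

assert replayed == 1000                     # scales with frame count
assert len(compacted) == 1                  # scales with distinct pages
assert compacted[5] == state[5] == "v999"   # same final state either way
```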

Canonical instance: Litestream revamp (2025-05-20)

Fly.io's 2025-05-20 Litestream revamp is the canonical wiki instance. Pre-revamp, Litestream shipped raw SQLite WAL frames to S3; restore cost scaled with WAL volume. Post-revamp, Litestream ships LTX files to object storage and runs a compaction process against them, converging on LiteFS's internal format. See systems/litestream for the system-level narrative and patterns/sqlite-plus-litefs-plus-litestream for the full stack.

A second consequence: compaction is cheap enough that wildcard/directory replication becomes viable. Replicating /data/*.db — hundreds or thousands of SQLite databases from a single Litestream process — was "infeasible" on the shadow-WAL design because WAL polling didn't amortise across databases; LTX streams amortise trivially.

Shipping-level compaction ladder (v0.5.0, 2025-10-02)

The 2025-10-02 Litestream v0.5.0 shipping post (sources/2025-10-02-flyio-litestream-v050-is-here) gives the pattern its first concrete production instantiation — a three-level time-window compaction hierarchy:

| Level | Window | Input                               |
| ----- | ------ | ----------------------------------- |
| L1    | 30 s   | raw LTX files from live replication |
| L2    | 5 min  | 10 × L1 files                       |
| L3    | 1 hour | 12 × L2 files                       |

From the post:

"at Level 1, we compact all the changes in a 30-second time window; at Level 2, all the Level 1 files in a 5-minute window; at Level 3, all the Level 2's over an hour. Net result: we can restore a SQLite database to any point in time, using only a dozen or so files on average. Litestream performs this compaction itself. It doesn't rely on SQLite to process the WAL file. Performance is limited only by I/O throughput."

Two invariants this ladder exposes:

  • Restore cost is bounded at O(dozen) files ("a dozen or so files on average") — regardless of retention depth. Compare against shadow-WAL: O(WAL-frames-since-last-snapshot).
  • Compaction is a Litestream-side background job, not a SQLite-triggered checkpoint. Performance is IO-bound on object-storage bandwidth, not CPU-bound.
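The bounded file count can be checked with a short model. Simplifying assumptions (mine, not the post's): only completed L1/L2 windows count as delta files, and the L3/snapshot files themselves plus any raw files still awaiting L1 compaction are ignored.

```python
def files_past_hour(seconds_into_hour):
    """Delta files a restore needs past the last hourly (L3) boundary,
    under the v0.5.0 ladder: 1-hour L3, 5-minute L2, 30-second L1
    windows. Illustrative model, not Litestream's actual code.
    """
    l2 = seconds_into_hour // 300   # completed 5-minute (L2) files
    remainder = seconds_into_hour % 300
    l1 = remainder // 30            # completed 30-second (L1) files
    return l2 + l1

# Worst case, just before the next hourly compaction: 11 L2 + 9 L1.
assert files_past_hour(3599) == 20

# Averaged over the hour the count is 10 exactly, matching the
# post's "a dozen or so files on average".
avg = sum(files_past_hour(s) for s in range(3600)) / 3600
assert abs(avg - 10.0) < 1e-9
```

The key property is that the worst case is a constant (set by the window ratios), not a function of retention depth: deeper retention only adds more fully compacted L3 files behind the snapshot.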

This makes the compaction ladder a specialisation of time-window compaction (LSM-style leveled compaction where level boundaries are time intervals rather than size thresholds) — the shape is reusable beyond SQLite/LTX for any append workload over a merge-compactable sorted-run format.

Trade-offs

  • Storage cost during retention. A long PITR retention window keeps intermediate LTX files around; storage cost is roughly proportional to retention × write-rate × dedupability. Compaction reduces restore cost, not archive cost at full retention.
  • Compaction trigger policy. The post doesn't disclose Litestream's compaction trigger (size thresholds, time thresholds, or both) — a design knob borrowed from any LSM-compaction literature.
  • Read amplification during live restore. If a restore target falls between compaction boundaries, the restore process merges the most recent compacted state with un-compacted deltas — more bandwidth than the "exact-point" case.

When it's the wrong shape

  • Streaming-read workloads over WAL CDC. Downstream consumers that need the full stream of writes (e.g. CDC into Kafka) want frame-level detail — compaction discards intermediate states. Use a separate CDC path (at the WAL layer) rather than consuming from compacted LTX.
  • Tiny, append-mostly databases. If the database never overwrites pages, compaction saves nothing — the compacted form equals the concatenation. LTX still helps with transaction awareness, but not via compaction.

Seen in
