PATTERN Cited by 3 sources

LTX compaction (time-window merge of SQLite page runs)¶

Pattern¶

Represent a database's changes as sorted per-transaction page-range files (LTX), then periodically k-way-merge adjacent time windows into larger files that keep only the latest version of each page. The output file is the point-in-time state of the database at the end of the window, encoded as a sorted page run.

Operationally, this is the LSM-tree compaction idea specialized for SQLite:

 Fine-grained LTX files (recent transactions)
   ┌──────┐ ┌──────┐ ┌──────┐ ┌──────┐
   │ tx1..│ │ tx3..│ │ tx7..│ │ tx9..│    (minutes of writes)
   └──────┘ └──────┘ └──────┘ └──────┘
       │       │        │        │
       └───────┴────────┴────────┘
                   │
                   ▼
              ┌─────────┐
              │ merged  │      (one file per hour; keeps only
              │ tx1..9  │       the latest page version)
              └─────────┘
                   │
                   ▼
              (daily window, etc.)

Why it works¶

Two properties of LTX make compaction cheap:

Sorted by page number. Two LTX files can be merged with a straight k-way-merge streaming algorithm (see patterns/streaming-k-way-merge). No sort step required.
Transaction-scoped. Merging two windows is safe because each input LTX file respects transaction boundaries — the merged file is equivalent to the database state at the end of the later window.

From Fly.io:

"This process of combining smaller time ranges into larger ones is called compaction. With it, we can replay a SQLite database to a specific point in time, with a minimal duplicate pages. This is similar to how an LSM tree works." (Source: sources/2025-05-20-flyio-litestream-revamped)

PITR cost structure¶

Restore target	Pre-LTX (shadow WAL)	Post-LTX (compaction)
Restore to "now"	snapshot + replay all WAL frames since	download latest compacted LTX
Restore to time `t` in the past	snapshot + replay all WAL frames to `t`	download snapshot LTX at `t` + few deltas
Effort scales with	WAL volume (frame count)	distinct pages touched (compacted size)

The second row is where the pattern pays off: if a single page was rewritten a thousand times in the window, the shadow-WAL approach replays a thousand records; the LTX-compaction approach carries one page in the compacted output.

Canonical instance: Litestream revamp (2025-05-20)¶

Fly.io's 2025-05-20 Litestream revamp is the canonical wiki instance. Pre-revamp, Litestream shipped raw SQLite WAL frames to S3; restore cost scaled with WAL volume. Post-revamp, Litestream ships LTX files to object storage and runs a compaction process against them, converging on LiteFS's internal format. See systems/litestream for the system-level narrative and patterns/sqlite-plus-litefs-plus-litestream for the full stack.

Second consequence: compaction is cheap enough that wildcard/directory replication becomes viable — replicating /data/*.db with hundreds or thousands of SQLite databases from a single Litestream process was "infeasible" on the shadow-WAL design because WAL polling didn't amortise across databases; LTX streams amortise trivially.

Shipping-level compaction ladder (v0.5.0, 2025-10-02)¶

The 2025-10-02 Litestream v0.5.0 shipping post (sources/2025-10-02-flyio-litestream-v050-is-here) gives the pattern its first concrete production instantiation — a three-level time-window compaction hierarchy:

Level	Window	Input
L1	30 s	raw LTX files from live replication
L2	5 min	10 × L1 files
L3	1 hour	12 × L2 files

From the post:

"at Level 1, we compact all the changes in a 30-second time window; at Level 2, all the Level 1 files in a 5-minute window; at Level 3, all the Level 2's over an hour. Net result: we can restore a SQLite database to any point in time, using only a dozen or so files on average. Litestream performs this compaction itself. It doesn't rely on SQLite to process the WAL file. Performance is limited only by I/O throughput."

Two invariants this ladder exposes:

Restore cost is bounded O(dozen) files — "a dozen or so files on average" — regardless of retention depth. Compare against shadow-WAL: O(WAL-frames-since-last-snapshot).
Compaction is a Litestream-side background job, not a SQLite-triggered checkpoint. Performance is IO-bound on object-storage bandwidth, not CPU-bound.

This makes the compaction ladder a specialisation of time-window compaction (LSM-style leveled compaction where level boundaries are time intervals rather than size thresholds) — the shape is reusable beyond SQLite/LTX for any append workload over a merge-compactable sorted-run format.

L0 level disclosure (v-post, 2025-12-11)¶

The 2025-12-11 Litestream VFS ship post adds an L0 level on top of the v0.5.0 L1/L2/L3 ladder and clarifies the snapshot cadence:

"By default, Litestream uses time intervals of 1 hour at the highest level, down to 30 seconds at level 1. L0 is a special level where files are uploaded every second, but are only retained until being compacted to L1."

And from the diagram caption:

"we're taking daily full snapshots"

The full ladder as of 2025-12-11:

Level	Cadence / Window	Retention
Snapshots	daily full	full retention
L3	1-hour windows	full retention
L2	5-minute windows	until compacted to L3
L1	30-second windows	until compacted to L2
L0	1-second uploads	until compacted to L1 (seconds)

L0 is "special" because its retention is ephemeral: L0 files are the seconds-granularity primary-emit stream, retained only until the next L1 compaction consolidates them. L0 is therefore not useful for long-term restore (the equivalent state is eventually promoted into L1+L2+L3 files), but it is useful as a fast-follower polling surface for VFS readers that want near-realtime freshness — the L0-polling pattern.

Design consequence: the ladder now has two different consumers:

Restore flow (cold recovery) consumes snapshots + L3 + L2 + L1 — stable, retention-bounded files; L0 is not in the restore critical path.
Near-realtime VFS read-replica flow consumes L0 directly via polling; L3/L2/L1 are the fallback path once L0 files age out.

Trade-offs¶

Storage cost during retention. A long PITR retention window keeps intermediate LTX files around; storage cost is roughly proportional to retention × write-rate × dedupability. Compaction reduces restore cost, not archive cost at full retention.
Compaction trigger policy. The post doesn't disclose Litestream's compaction trigger (size thresholds, time thresholds, or both) — a design knob borrowed from any LSM-compaction literature.
Read amplification during live restore. If a restore target falls between compaction boundaries, the restore process merges the most recent compacted state with un-compacted deltas — more bandwidth than the "exact-point" case.

When it's the wrong shape¶

Streaming-read workloads over WAL CDC. Downstream consumers that need the full stream of writes (e.g. CDC into Kafka) want frame-level detail — compaction discards intermediate states. Use a separate CDC path (at the WAL layer) rather than consuming from compacted LTX.
Tiny, append-mostly databases. If the database never overwrites pages, compaction saves nothing — the compacted form equals the concatenation. LTX still helps with transaction awareness, but not via compaction.

Seen in¶

sources/2025-05-20-flyio-litestream-revamped — canonical wiki instance; "This is similar to how an LSM tree works."
sources/2025-10-02-flyio-litestream-v050-is-here — shipping-level disclosure of the three-level 30s / 5m / 1h compaction ladder; restore-cost invariant "a dozen or so files on average".
sources/2025-12-11-flyio-litestream-vfs — L0-level disclosure. Adds L0 (1-file-per-second upload, retained only until L1 compaction) as a "special" level on top of the L1/L2/L3 ladder. Confirms daily full snapshots above L3. L0 becomes the polling target for the shipped Litestream VFS reader's near-realtime replica behaviour.

concepts/ltx-file-format — the file format this pattern operates on.
concepts/lsm-compaction — the parent algorithm class.
systems/litefs — the original LTX consumer.
systems/litestream — the post-revamp LTX compaction consumer.
systems/litestream-vfs — the L0-polling reader that turns the ladder's finest level into a near-realtime replica.
systems/sqlite — the substrate whose pages LTX encodes.
patterns/near-realtime-replica-via-l0-polling — the reader pattern that composes with the compaction ladder's L0 level.
patterns/sqlite-plus-litefs-plus-litestream — the three-layer stack.
patterns/streaming-k-way-merge — the algorithmic substrate compaction runs on.