

SQLite + LiteFS + Litestream

Pattern

Use SQLite as the primary storage engine, LiteFS as the distributed primary/replica filesystem layer (subsecond replication + primary failover), and Litestream for point-in-time recovery (PITR) to object storage. Both LiteFS and Litestream work with unmodified SQLite libraries — the application doesn't know it's distributed.

The resulting stack:

  App (uses plain SQLite driver)
        |
    SQLite file
        |
    LiteFS FUSE  ───► subsecond replicas (other regions)
                       + primary failover
        |
    Local storage
        |
    Litestream  ───► Object storage (S3-compatible)
                       (PITR, cold recovery)
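The top layer of the diagram is the whole point: the application opens the database like any local SQLite file and never sees the replication machinery. A minimal sketch using Python's stdlib sqlite3 (in production the path would sit on the LiteFS FUSE mount, e.g. a hypothetical /litefs/app.db; a temp file is used here so the sketch runs anywhere):

```python
import os
import sqlite3
import tempfile

# Stand-in for a path on the LiteFS FUSE mount (e.g. "/litefs/app.db").
db_path = os.path.join(tempfile.mkdtemp(), "app.db")

conn = sqlite3.connect(db_path)
# Litestream requires WAL journal mode to tail changes for replication.
conn.execute("PRAGMA journal_mode=WAL")
conn.execute("CREATE TABLE revocations (token_id TEXT PRIMARY KEY)")
conn.execute("INSERT INTO revocations VALUES (?)", ("tok-123",))
conn.commit()

rows = conn.execute("SELECT token_id FROM revocations").fetchall()
```

Nothing above imports a LiteFS or Litestream client; the distribution happens below the file API.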

Why it works for small hot data

From Fly.io:

"We've been running Macaroons for a couple years now, and the entire tkdb database is just a couple dozen megs large. Most of that data isn't real. A full PITR recovery of the database takes just seconds. We use SQLite for a lot of our infrastructure, and this is one of the very few well-behaved databases we have." (Source: sources/2025-03-27-flyio-operationalizing-macaroons.)

Preconditions (what makes a workload well-behaved for this stack):

  • Small steady-state size (MB-to-GB, not TB).
  • Low write rate — SQLite's single-writer model is the hard ceiling.
  • Needs geographic replicas (LiteFS) and recoverable history (Litestream) — a single-node SQLite instance is much simpler if you don't.
  • Application doesn't need features like full-text search at huge scale, vector search, etc.
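The second precondition — the single-writer ceiling — is easy to observe with two connections. A small sketch (stdlib sqlite3, with timeout=0 so the second writer fails fast instead of waiting on the busy handler):

```python
import os
import sqlite3
import tempfile

db_path = os.path.join(tempfile.mkdtemp(), "demo.db")

# isolation_level=None: manage transactions manually.
writer1 = sqlite3.connect(db_path, timeout=0, isolation_level=None)
writer1.execute("PRAGMA journal_mode=WAL")
writer1.execute("CREATE TABLE t (x INTEGER)")
writer1.execute("BEGIN IMMEDIATE")      # takes the single write lock
writer1.execute("INSERT INTO t VALUES (1)")

writer2 = sqlite3.connect(db_path, timeout=0, isolation_level=None)
try:
    writer2.execute("BEGIN IMMEDIATE")  # second concurrent writer
    second_writer_ok = True
except sqlite3.OperationalError:        # "database is locked"
    second_writer_ok = False

writer1.execute("COMMIT")
```

WAL mode allows readers to proceed concurrently, but there is still exactly one writer at a time — which is why write rate, not read rate, is the hard ceiling.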

Canonical instance: tkdb

  • Size: "a couple dozen megs" (most of it test/noise).
  • Write paths: only two — HMAC root-key insert on org creation, revocation-list appends.
  • PITR recovery time: "just seconds".
  • Regions: US (primary) + EU + AU (replicas); failover-able primary.
  • Encryption: records encrypted with injected secret before write.
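The two write paths can be made concrete with a toy macaroon-style sketch using stdlib hmac/hashlib. A dict and a list stand in for the SQLite tables; names and layout are illustrative assumptions, not Fly.io's schema (and the encrypt-before-write step is omitted):

```python
import hashlib
import hmac
import secrets

root_keys = {}     # org_id -> HMAC root key (stands in for a tkdb table)
revocations = []   # append-only revocation list (the other write path)

def create_org(org_id: str) -> None:
    # Write path 1: mint a root key on org creation.
    root_keys[org_id] = secrets.token_bytes(32)

def sign(org_id: str, caveats: bytes) -> bytes:
    # Macaroon-style: the token's tag chains off the org's root key.
    return hmac.new(root_keys[org_id], caveats, hashlib.sha256).digest()

def verify(org_id: str, caveats: bytes, tag: bytes) -> bool:
    return hmac.compare_digest(sign(org_id, caveats), tag)

def revoke(token_id: str) -> None:
    # Write path 2: append to the revocation list.
    revocations.append(token_id)

create_org("acme")
tag = sign("acme", b"path = /app")
ok = verify("acme", b"path = /app", tag)
bad = verify("acme", b"path = /other", tag)
revoke("tok-123")
```

Both write paths are rare events (org creation, revocation), which is what keeps the database "a couple dozen megs" and well-behaved.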

Counter-instance (same company)

Fly.io's earlier corrosion infrastructure-SQLite project "routinely ballooned to tens of gigabytes and occasionally threatened service outages" — implicitly invoked as the foil in the closing paragraph of the tkdb post. The tkdb endorsement is explicitly "earned, not assumed": it was "lovely to see" a well-behaved infra-SQLite database after a not-well-behaved one.

When it's wrong

  • Write-heavy workloads — SQLite's single-writer model bottlenecks.
  • Large datasets — LiteFS replication cost scales with WAL; PITR recovery scales with snapshot+WAL size.
  • Workloads that need cross-region multi-writer concurrency — this is not that stack.

2025-05-20 revamp: architectural convergence on LTX

The 2025-05-20 Litestream redesign (see sources/2025-05-20-flyio-litestream-revamped) makes this three-layer stack architecturally cleaner — the three layers now share their internal wire format:

  • LiteFS already used LTX for node-to-node replication.
  • Litestream (post-revamp) uses LTX for node-to-object-store replication — retiring the old shadow WAL raw-frame shipping.
  • SQLite is integrated via either FUSE (LiteFS) or VFS (the new Litestream read-replica layer; LiteFS has LiteVFS).
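Concretely, the Litestream leg is configured against the plain database path; the LTX wire format is internal and doesn't surface in the config. A minimal litestream.yml sketch, with hypothetical paths and bucket name:

```yaml
# litestream.yml — paths and bucket name are hypothetical
dbs:
  - path: /data/app.db
    replicas:
      - url: s3://my-bucket/app-db
```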

Three second-order consequences for the stack:

  1. Restore cost drops. patterns/ltx-compaction makes PITR cost proportional to distinct pages touched, not raw WAL volume — restoring a long-retention target no longer replays every historical write.
  2. Single-leader enforcement no longer needs Consul. LiteFS's original Consul dependency for primary election is still available, but Litestream-only deployments now use CASAAS — object-store conditional writes as the coordination substrate.
  3. Wildcard / directory replication becomes viable. LTX's per-database cheapness lets one Litestream process replicate /data/*.db across hundreds or thousands of SQLite databases — previously infeasible.
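The cost model in point 1 — restore proportional to distinct pages touched, not total writes — can be sketched as a toy page-level compaction. The tuple shapes are assumptions for illustration; real LTX files carry page images plus headers and checksums:

```python
def compact(frames):
    """Collapse a history of (txid, page_no, page_bytes) writes so only
    the newest version of each page survives — conceptually what LTX
    compaction does. Applying the result costs O(distinct pages)."""
    latest = {}
    for txid, page_no, data in sorted(frames, key=lambda f: f[0]):
        latest[page_no] = data
    return latest

# Six historical writes, but only three distinct pages touched:
history = [
    (1, 1, b"a0"), (2, 2, b"b0"), (3, 1, b"a1"),
    (4, 3, b"c0"), (5, 1, b"a2"), (6, 2, b"b1"),
]
compacted = compact(history)
```

For a hot-page workload (e.g. a revocation list rewritten in place), distinct pages stays nearly flat while raw WAL volume grows without bound — hence the drop in restore cost.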

The pattern's workload preconditions are unchanged — SQLite is still single-writer, still MB-to-GB, still not for write-heavy TB workloads. What changes is the storage-layer cost structure within those preconditions.
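The CASAAS idea in point 2 above is compare-and-swap over object-store conditional writes. A toy in-memory sketch of lease acquisition under that assumption — a real deployment would use the object store's conditional PUT (create-only-if-absent); all names here are hypothetical:

```python
import time

class FakeObjectStore:
    """Stands in for an object store with conditional writes
    (create-only-if-absent), the primitive CASAAS builds on."""
    def __init__(self):
        self.objects = {}

    def put_if_absent(self, key, value):
        if key in self.objects:
            return False          # conditional write rejected
        self.objects[key] = value
        return True

def try_acquire_lease(store, node_id, ttl_s=30):
    """Single-leader election: whichever node wins the conditional
    create of the lease object becomes the writer."""
    lease = {"owner": node_id, "expires": time.time() + ttl_s}
    return store.put_if_absent("lease", lease)

store = FakeObjectStore()
a_wins = try_acquire_lease(store, "node-a")
b_wins = try_acquire_lease(store, "node-b")   # loses: lease already held
```

The coordination state lives in the same object store that already holds the replicated data, which is what lets Litestream-only deployments drop the Consul dependency.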

Seen in

  • sources/2025-03-27-flyio-operationalizing-macaroons — tkdb is the canonical wiki instance.
  • sources/2025-05-20-flyio-litestream-revamped — the architectural-redesign entry; LTX + CASAAS + VFS revamp of Litestream converges the stack on a shared on-wire format and unlocks cheap PITR + wildcard replication. Forward-looking (no production numbers yet), but canonical disclosure of the new shape.
  • sources/2025-10-02-flyio-litestream-v050-is-here — shipping entry for the Litestream leg. v0.5.0 delivers LTX on-the-wire, a three-level hierarchical compaction ladder (30s / 5m / 1h — restore bounded to "a dozen or so files on average"), monotonic TXIDs replacing generations, per-page compression + EOF index in the LTX library (random-access precondition for VFS read replicas), and CGO removal via modernc.org/sqlite (Go cross-compile Just Works). Read-replica layer still proof-of-concept. The stack's workload preconditions are unchanged — SQLite is still single-writer, still MB-to-GB — but the Litestream leg is now substantially more efficient per byte of retained history.