
Litestream VFS

Litestream VFS is a SQLite Virtual Filesystem extension that lets an unmodified SQLite library read a database "hot off an object storage URL": individual page reads resolve to HTTP Range GETs against LTX files in S3-compatible storage, fronted by an in-process LRU cache of hot pages. It shipped on 2025-12-11, after being teased as a proof of concept in the [[sources/2025-05-20-flyio-litestream-revamped|2025-05-20 design post]] and explicitly flagged as "not yet shipped" in the [[sources/2025-10-02-flyio-litestream-v050-is-here|2025-10-02 v0.5.0 shipping post]].

Activation surface

Standard SQLite loadable-extension mechanism:

$ sqlite3
SQLite version 3.50.4 ...
sqlite> .load litestream.so
sqlite> .open file:///my.db?vfs=litestream

The vfs=litestream URI parameter tells SQLite to route OS-level I/O for this connection through the Litestream VFS extension rather than the default OS VFS. From the application's point of view, it is still stock SQLite.

"In particular: Litestream VFS doesn't replace the SQLite library you're already using. It's not a new 'version' of SQLite. It's just a plugin for the SQLite you're already using." (Source: sources/2025-12-11-flyio-litestream-vfs)

Read-side-only

From the post:

"We override only the few methods we care about. Litestream VFS handles only the read side of SQLite. Litestream itself, running as a normal Unix program, still handles the 'write' side. So our VFS subclasses just enough to find LTX backups and issue queries."

Writes still flow through the regular Litestream primary (the Unix program running alongside the application, tailing the WAL and emitting LTX files). The VFS is an additive read-side capability, not a replacement for the write path.

Page lookup via LTX index trailer

SQLite's Read() call carries a byte offset computed against the "local file" illusion. The VFS translates that offset into a page number (offset ÷ page size) and uses a lookup table to find the remote LTX file containing the page and the real byte offset within that file. The lookup table is built from the LTX end-of-file index trailers: each LTX file's trailer is ~1% of the file size, so only a small fraction of remote bytes needs to be fetched to build a database-wide page index.

"LTX trailers include a small index tracking the offset of each page in the file. By fetching only these index trailers from the LTX files we're working with (each occupies about 1% of its LTX file), we can build a lookup table of every page in the database."
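The index-building step can be sketched as follows. The real LTX trailer is a binary structure; here each trailer is modeled as an already-parsed list of entries, and the file names and trailer shape are illustrative assumptions, not the actual format:

```python
# Sketch: build a database-wide page index from LTX trailer data.
# Each trailer is modeled as a parsed list of (page_number, offset, size);
# the real trailer encoding is binary and not reproduced here.

def build_page_index(trailers):
    """Map page_number -> (ltx_filename, byte_offset, size).

    `trailers` is {ltx_filename: [(page_number, offset, size), ...]},
    iterated oldest-to-newest so later LTX files win for a given page.
    """
    index = {}
    for filename, entries in trailers.items():
        for page_number, offset, size in entries:
            index[page_number] = (filename, offset, size)
    return index

# Two LTX files; the second rewrites page 2, so its entry shadows the first.
trailers = {
    "0000000001-0000000005.ltx": [(1, 100, 4096), (2, 4196, 4096)],
    "0000000006-0000000007.ltx": [(2, 100, 4096)],
}
index = build_page_index(trailers)
print(index[2])   # page 2 resolves to the newer LTX file
```

The "newest entry wins" merge is what makes a set of incremental LTX files behave like one coherent database image.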

Range GET against object storage

Once the page's (filename, byte offset, size) is known, the VFS issues an HTTP Range GET against the object store (S3 / Tigris / GCS / Azure Blob — any provider that honors the Range request header):

"That's enough for us to use the S3 API's Range header handling to download exactly the block we want."

Canonical instance of patterns/vfs-range-get-from-object-store.
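The shape of that request can be sketched with the standard library. The URL is a hypothetical object URL, and `fetch_page` stands in for whatever signed-request machinery a real client would use; the Range header arithmetic is the part being illustrated:

```python
import urllib.request

def range_header(offset, size):
    """HTTP Range header for `size` bytes starting at `offset` (inclusive end)."""
    return "bytes=%d-%d" % (offset, offset + size - 1)

def fetch_page(url, offset, size):
    """Download exactly one page from a store that honors Range GETs.
    `url` is a hypothetical object URL; any S3-compatible store works."""
    req = urllib.request.Request(url, headers={"Range": range_header(offset, size)})
    with urllib.request.urlopen(req) as resp:   # expects 206 Partial Content
        return resp.read()

# A 4 KiB page at byte offset 8192 maps to this Range header:
print(range_header(8192, 4096))   # bytes=8192-12287
```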

LRU cache exploits SQLite B-tree hot-set

To avoid round-tripping to S3 on every page read, the VFS keeps an in-process LRU cache of recently-read pages. The post names the SQLite-specific hot-set shape:

"To save lots of S3 calls, Litestream VFS implements an LRU cache. Most databases have a small set of 'hot' pages — inner branch pages or the leftmost leaf pages for tables with an auto-incrementing ID field. So only a small percentage of the database is updated and queried regularly."

This is a sibling observation to the 2025-10-02 "sandwiches" worked example: AUTOINCREMENT primary keys mean every insert hits the same rightmost leaf, and queries walk the same inner branch pages to find rows; the resulting hot set is tiny and an ideal fit for an LRU cache.
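A minimal sketch of such a page cache, with `fetch` standing in for the Range-GET path (the real cache's capacity policy and internals are not described in the post, so this is only the general shape):

```python
from collections import OrderedDict

class PageLRU:
    """Tiny LRU page cache in the spirit of the VFS's hot-page cache.
    `fetch` stands in for the Range-GET path; capacity counts pages."""
    def __init__(self, capacity, fetch):
        self.capacity = capacity
        self.fetch = fetch
        self.cache = OrderedDict()
        self.misses = 0

    def read_page(self, page_number):
        if page_number in self.cache:
            self.cache.move_to_end(page_number)      # mark as most recently used
            return self.cache[page_number]
        self.misses += 1
        data = self.fetch(page_number)               # would be an S3 Range GET
        self.cache[page_number] = data
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)           # evict least recently used
        return data

lru = PageLRU(capacity=2, fetch=lambda n: b"page-%d" % n)
lru.read_page(1); lru.read_page(2); lru.read_page(1)   # page 1 stays hot
lru.read_page(3)                                       # evicts page 2
lru.read_page(1)                                       # still cached
print(lru.misses)   # 3
```

With a hot set this small, even a two-page cache turns five logical reads into three remote fetches; a realistic hot set of branch pages does far better.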

Near-realtime replica via L0 polling

Because LTX compaction's L0 level uploads one file per second (kept only until the next L1 compaction), the VFS can poll the object-store path for new L0 files and incrementally update its page-lookup table:

"Because Litestream backs up (into the L0 layer) once per second, the VFS code can simply poll the S3 path, and then incrementally update its index. The result is a near-realtime replica. Better still, you don't need to stream the whole database back to your machine before you use it."

Canonical instance of patterns/near-realtime-replica-via-l0-polling.
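One poll tick can be sketched as follows. `list_l0_files` and `fetch_trailer` stand in for an object-store LIST and a trailer Range GET; the file-naming scheme is invented for the example:

```python
def poll_l0(list_l0_files, fetch_trailer, index, seen):
    """One poll tick: find L0 files we haven't indexed yet and merge their
    trailer entries into the page index (newest page versions win)."""
    for filename in sorted(list_l0_files()):
        if filename in seen:
            continue
        for page_number, offset, size in fetch_trailer(filename):
            index[page_number] = (filename, offset, size)
        seen.add(filename)

# Simulated store: one L0 file per second; the newer one rewrites page 7.
store = {
    "l0/0001.ltx": [(7, 100, 4096)],
    "l0/0002.ltx": [(7, 100, 4096), (9, 4196, 4096)],
}
index, seen = {}, set()
poll_l0(lambda: store.keys(), lambda f: store[f], index, seen)
print(index[7][0])   # newest L0 file wins: l0/0002.ltx
```

Because each tick only fetches trailers for files it hasn't seen, the steady-state cost of staying near-realtime is one LIST plus at most one small trailer read per second.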

SQL-level PITR via PRAGMA litestream_time

Point-in-time recovery is now a query-time knob, not a CLI operation:

sqlite> PRAGMA litestream_time = '5 minutes ago';
sqlite> SELECT * FROM sandwich_ratings ORDER BY RANDOM() LIMIT 3;
30|Meatball|Los Angeles|5
33|Ham & Swiss|Los Angeles|2
163|Chicken Shawarma Wrap|Detroit|5

The pragma accepts relative timestamps ("5 minutes ago") or absolute ones (e.g. 2000-01-01T00:00:00Z). Subsequent reads on that connection resolve against the LTX state at the chosen timestamp. Canonical instance of concepts/pragma-based-pitr.

Worked disaster-recovery example from the post: somebody runs UPDATE sandwich_ratings SET stars = 1 in prod (missing WHERE clause); on a dev machine the operator sets PRAGMA litestream_time = '5 minutes ago' and queries the table at its pre-disaster state. "Updating your database state to where it was an hour (or day, or week) ago is just a matter of adjusting the LTX indices Litestream manages."
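Resolving the pragma's timestamp argument to a point in time might look like this. The accepted grammar here is a guess for illustration, not Litestream's actual parser:

```python
from datetime import datetime, timedelta, timezone
import re

def resolve_litestream_time(value, now=None):
    """Resolve a PRAGMA litestream_time value to a UTC datetime.
    Supports 'N <unit> ago' and ISO-8601 absolutes; this grammar is an
    assumption, not Litestream's documented parser."""
    now = now or datetime.now(timezone.utc)
    m = re.fullmatch(r"(\d+)\s+(second|minute|hour|day|week)s?\s+ago", value)
    if m:
        n, unit = int(m.group(1)), m.group(2)
        return now - timedelta(**{unit + "s": n})
    # Absolute form, e.g. 2000-01-01T00:00:00Z
    return datetime.fromisoformat(value.replace("Z", "+00:00"))

now = datetime(2025, 12, 11, 12, 0, tzinfo=timezone.utc)
print(resolve_litestream_time("5 minutes ago", now))   # 2025-12-11 11:55:00+00:00
```

Once resolved, the instant selects which LTX files (and which page versions within them) the read path consults; no data is rewritten.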

Fast startup for ephemeral servers

"It starts up really fast! We're living an age of increasingly ephemeral servers, what with the AIs and the agents and the clouds and the hoyvin-glavins. Wherever you find yourself, if your database is backed up to object storage with Litestream, you're always in a place where you can quickly issue a query."

The cold-open path:

  1. Open connection with vfs=litestream.
  2. Fetch EOF index trailers for relevant LTX files (~1% of each).
  3. Build in-memory page index.
  4. Serve queries via cache + Range GET on misses.
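The four steps above can be wired together in a few lines. `fetch_trailer` and `fetch_range` again stand in for object-store requests, with fake data so the flow is visible end to end:

```python
def cold_open(ltx_files, fetch_trailer, fetch_range, page_number):
    """Minimal cold-open sketch: build the page index from trailers only,
    then serve one page via a simulated Range GET."""
    index = {}
    for filename in ltx_files:                       # step 2: trailers only (~1%)
        for pgno, offset, size in fetch_trailer(filename):
            index[pgno] = (filename, offset, size)   # step 3: in-memory index
    filename, offset, size = index[page_number]      # step 4: resolve + Range GET
    return fetch_range(filename, offset, size)

data = cold_open(
    ["db/0001.ltx"],
    lambda f: [(1, 0, 4096)],
    lambda f, off, size: b"\x00" * size,             # fake 4 KiB page body
    page_number=1,
)
print(len(data))   # 4096
```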

No full-database download, no replica-seeding delay, no local WAL machinery. This is what makes the VFS a plausible primitive for agentic coding platforms (Phoenix.new is explicitly the kind of consumer cited across the Litestream redesign series) where per-session compute is ephemeral.

Relationship to the rest of Litestream

Opt-in, additive:

"you don't have to use our VFS library to use Litestream, or to get the other benefits of the new LTX code."

Litestream without the VFS is still the 2025-10-02 v0.5.0 system — full database restores, hierarchical compaction, NATS JetStream / S3 / GCS / Azure replica types, etc. The VFS adds a page-granular read path, near-realtime-replica behavior, and SQL-level PITR; it doesn't replace or require anything else.

Convergence with LiteFS

LiteFS has shipped LiteVFS — its FUSE-free integration surface — since well before 2025. LiteFS-side replicas therefore already had a VFS option. With Litestream VFS shipping, the three-layer stack has a second VFS surface on the Litestream side, using the same on-disk LTX format. The architectural convergence foreshadowed in the 2025-05-20 post is now concrete: both tools share the LTX format and a VFS integration surface.

Seen in

  • sources/2025-05-20-flyio-litestream-revamped: design post. VFS-based read replicas teased as "what we're doing next", with the FUSE-as-usability-wall argument.
  • sources/2025-10-02-flyio-litestream-v050-is-here: shipping post for v0.5.0. VFS explicitly flagged as "we already have a proof of concept working and we're excited to show it off when it's ready!" — not shipped in v0.5.0. The per-page compression + EOF index that v0.5.0 did ship is the structural precondition Litestream VFS needs.
  • sources/2025-12-11-flyio-litestream-vfs: ship announcement. Canonical wiki introduction of the Litestream VFS system. .load litestream.so + file:///my.db?vfs=litestream activation; page lookup via EOF-index trailers; HTTP Range GET to S3-compatible storage; LRU cache of hot pages; L0 polling for near-realtime replica; PRAGMA litestream_time for SQL-level PITR; read-side-only; opt-in.