Skip to content

PATTERN Cited by 2 sources

VFS Range GET from object storage

Pattern

Run a database engine "hot off an object storage URL" by intercepting its I/O layer with a VFS / filesystem extension the application links, and resolving each page read via an HTTP Range GET against a remote file in S3-compatible object storage. A small in-process LRU cache fronts the range reads so the hot set doesn't round-trip on every query.

Three layers compose:

SQLite library (unmodified)
    │  Read(page N, offset, size)
VFS / extension (this pattern's layer)
    │  1. Look up page N in local page-index (pre-built from
    │     remote file trailers)
    │  2. If page in LRU cache → return
    │  3. Else HTTP Range GET (remote_file, byte_offset, size)
    │     → populate LRU → return
Object storage (S3 / Tigris / GCS / Azure Blob)
    │  Range: bytes=<offset>-<offset+size-1>
    Returns the bytes of exactly one SQLite page

Canonical instance: Litestream VFS

From the 2025-12-11 shipping post:

"With our VFS loaded, whenever SQLite needs to read a page into memory, it issues a Read() call through our library. The read call includes the byte offset at which SQLite expected to find the page. But with Litestream VFS, that byte offset is an illusion. Instead, we use our knowledge of the page size along with the requested page number to do a lookup on the page index we've built. From it, we get the remote filename, the 'real' byte offset into that file, and the size of the page. That's enough for us to use the S3 API's Range header handling to download exactly the block we want." (Source: sources/2025-12-11-flyio-litestream-vfs)

Litestream VFS realises the pattern against LTX files, using the LTX index trailer (~1% of each LTX file) to pre-build the page-lookup table at cold-open.

Why it works

Three underlying primitives make this cheap:

  1. SQLite VFS — the application's unmodified SQLite library routes all OS-level I/O through a pluggable interface. Intercepting reads is a supported extension point, not a hack.
  2. Random-accessible file format — the backing file format (LTX in Litestream's case) ships with per-page compression + an end-of-file index trailer that maps page-number → byte-offset. Any single page can be located without scanning the file.
  3. Object-storage Range GET — modern S3-compatible stores (including S3, Tigris, GCS, Azure Blob) honour the HTTP Range request header for byte-granular reads. Cost scales with bytes actually read, not file size.

LRU cache exploits the hot-set shape

Page-level reads are only cheap if most reads miss the cache at a low enough rate to keep S3 costs bounded. The post names the SQLite-specific hot-set shape this pattern leverages:

"Most databases have a small set of 'hot' pages — inner branch pages or the leftmost leaf pages for tables with an auto-incrementing ID field. So only a small percentage of the database is updated and queried regularly."

For B-tree databases with bounded working sets (canonical SQLite workload), a modest LRU cache absorbs the traffic that would otherwise become Range GETs.

Pairs with: L0 polling for near-realtime freshness

This pattern gives a read-side primitive — the application can query a remote database without downloading it. To keep reads fresh, pair with patterns/near-realtime-replica-via-l0-polling: the VFS polls the writer's finest-grained upload level (L0, uploaded every second in Litestream's case) and incrementally updates its page index. Together: pay for reads (Range GET on miss) + keep freshness via poll-and-update.

Trade-offs

  • Remote reads are not free. "this approach isn't as efficient as a local SQLite database. That kind of efficiency, where you don't even need to think about N+1 queries because there's no network round-trip, is part of the point of using SQLite." SQLite workloads that assume zero network cost (N+1-heavy query patterns, PRAGMA journal_mode=WAL-style write bursts expected to land locally) can degrade pathologically.
  • Cache miss = HTTP round-trip. p99 per-page read latency is object-storage RTT-bound; unfavourable against in-process page cache.
  • LRU tuning matters. Cache size too small → thrashing; too large → memory overhead defeats the "ephemeral server" premise.
  • File-format precondition is specific. Random-access over remote files requires a file format designed for it. Raw SQLite files aren't random-access at page granularity over HTTP; LTX is because of its per-page compression + EOF index trailer.

When it's the wrong shape

  • Write-heavy workloads. The pattern is read-side-only; writes still need the primary (local SQLite + Litestream or LiteFS on the writer side). Applications dominated by writes get no benefit.
  • Chatty query patterns. ORM-style SELECT * FROM a; for each a: SELECT * FROM b WHERE a_id = … walks hundreds of small pages; each cache miss is an HTTP round-trip. Batch-shape queries fare much better.
  • Strong read-after-write on the same session. If the reader is the writer too (unusual but possible), local-SQLite is strictly better; the VFS introduces unnecessary latency.
  • When FUSE is available and fine. If the deployment surface allows FUSE (bare-metal Linux, K8s with privileged: true), a FUSE-mounted LiteFS replica can give lower-latency reads for a small per-node cost — worth evaluating vs a VFS-range-GET reader.

2026-02-04 extension: writable variant

The read-only shape above is extended by sources/2026-02-04-flyio-litestream-writable-vfs into a single-writer-with-buffered-sync variant gated on LITESTREAM_WRITE_ENABLED=true. The VFS now intercepts writes too: page writes accumulate in a local write buffer, synced to object storage every ~1 s or on clean shutdown. Polling is disabled in this mode (no remote writers assumed). The post is explicit that this variant is not a multi-writer SQLite — "multiple-writer distributed SQLite databases are the Lament Configuration" — and that its durability contract matches Sprites' "eventual durability" rather than strict sync-per-commit.

When to use the writable variant:

  • A single-tenant, single-writer metadata store rooted in object storage (canonical: the Fly Sprites "block map" running JuiceFS metadata on SQLite + Litestream VFS).
  • An application that's already comfortable with bounded-loss durability (~1 s of writes can be lost on unclean shutdown) because the surrounding storage stack already gives that SLO.

When not to use the writable variant:

  • Anything remotely multi-writer. No coordination, no leases at this layer.
  • Applications that require synchronous durability per commit.
  • Applications that don't have a natural single-writer invariant — the VFS assumes it without enforcing it.

The writable variant is the narrow-fit capability the 2026-02-04 post repeatedly warns about: "not as general-purpose as it might look."

Seen in

  • sources/2025-12-11-flyio-litestream-vfs — canonical wiki instance. Litestream VFS ships the composite pattern: .load litestream.so + file:///my.db?vfs=litestream + LTX-trailer- based page index + LRU cache of hot pages + HTTP Range GET on miss.
  • sources/2026-02-04-flyio-litestream-writable-vfspattern extended to writes. LITESTREAM_WRITE_ENABLED=true makes the VFS bidirectional (single-writer only, ~1 s buffered sync to object storage). Production consumer: the Fly Sprites block map, running JuiceFS metadata on this stack.
Last updated · 542 distilled / 1,571 read