PATTERN Cited by 1 source

Background hydration to local file

Pattern

Serve queries against a remote data source immediately via a slow-but-cold-open-capable read path (e.g. HTTP Range GETs), while a background loop downloads the full dataset into a local file. When the local file is complete, cut reads over to it — faster steady-state latency, no user-visible stall.
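As a rough illustration of the cold-open read path, the sketch below builds an HTTP Range request for one slice of a remote object. The helper names and the example URL are hypothetical, and the sketch says nothing about how Litestream itself lays out or addresses pages; it only shows the standard Range header arithmetic.

```python
import urllib.request


def range_header(offset, length):
    """Return an HTTP Range header value for `length` bytes starting at `offset`."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length > 0")
    # HTTP byte ranges are inclusive on both ends.
    return f"bytes={offset}-{offset + length - 1}"


def build_range_request(url, offset, length):
    """Build (but do not send) a GET request for one byte range of an object."""
    headers = {"Range": range_header(offset, length)}
    return urllib.request.Request(url, headers=headers)


# Hypothetical object URL; nothing is sent over the network here.
req = build_range_request("https://example.invalid/bucket/object", 8192, 4096)
print(req.get_header("Range"))  # bytes=8192-12287
```

A server that honours the header answers with 206 Partial Content and only the requested bytes, which is what makes serving a query possible without downloading the whole dataset first.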

Three phases over time:

t=0  open VFS / data source
     │  Reads → Range GET against object storage (slow-ish)
     │  Hydration thread: pulling LTX files, merging into
     │                    local file using compaction
t=T  hydration complete
     │  Reads → local file (fast)
     │  Background loop exits
t=E  process exits
     │  Local hydration file discarded

Key property: reads never block on hydration. The cold-open path serves queries the moment the VFS is loaded; hydration is a latency-improvement optimisation, not a correctness requirement.
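The phases above can be sketched in a few lines. This is a minimal model under stated assumptions, not Litestream's implementation: `fetch_remote_page` is a hypothetical stand-in for a Range GET, pages are fixed-size, and the class name is invented for illustration.

```python
import os
import threading


class HydratingReader:
    """Serve page reads remotely until a background copy finishes, then locally."""

    def __init__(self, fetch_remote_page, page_count, page_size, local_path):
        self._fetch = fetch_remote_page     # hypothetical stand-in for a Range GET
        self._page_count = page_count
        self._page_size = page_size
        self._local_path = local_path
        self._hydrated = threading.Event()  # set once the local file is complete
        self._thread = threading.Thread(target=self._hydrate, daemon=True)

    def start(self):
        """Begin background hydration; reads keep working in the meantime."""
        self._thread.start()

    def _hydrate(self):
        # Write under a temporary name so readers never see a partial file.
        tmp = self._local_path + ".partial"
        with open(tmp, "wb") as f:
            for n in range(self._page_count):
                f.write(self._fetch(n))
        os.replace(tmp, self._local_path)
        self._hydrated.set()                # cutover point

    def read_page(self, n):
        if self._hydrated.is_set():
            with open(self._local_path, "rb") as f:
                f.seek(n * self._page_size)
                return f.read(self._page_size)
        return self._fetch(n)               # never waits for hydration

    def wait_hydrated(self, timeout=None):
        return self._hydrated.wait(timeout)
```

The only shared state between the two paths is the completion flag, so a read issued before cutover costs one remote fetch and a read issued after it costs one local read; neither waits on the background thread.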

Canonical instance: Litestream VFS

From the 2026-01-29 shipping post:

"To solve this problem, we shoplifted a trick from systems like dm-clone: background hydration. In hydration designs, we serve queries remotely while running a loop to pull the whole database. When you start the VFS with the LITESTREAM_HYDRATION_PATH environment variable set, we'll hydrate to that file. Hydration takes advantage of LTX compaction, writing only the latest versions of each page. Reads don't block on hydration; we serve them from object storage immediately, and switch over to the hydration file when it's ready."

(Source: sources/2026-01-29-flyio-litestream-writable-vfs)

Activation: LITESTREAM_HYDRATION_PATH=/path/to/hydrated.db.

Prior art: dm-clone

Linux's dm-clone device-mapper target is the architectural template:

  • A source device (remote, slow) + a destination device (local, fast).
  • Reads are routed to the destination if the destination has that block; otherwise to the source.
  • A hydrator runs in the background, copying blocks from source to destination.
  • Once fully hydrated, all reads are local.
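The routing rule in the list above can be modelled with a per-block "already copied" map. This is a toy in-memory sketch, not the dm-clone kernel interface; the class and method names are invented for illustration.

```python
class CloneDevice:
    """Toy model of dm-clone style routing between a source and a destination."""

    def __init__(self, source):
        self.source = source                    # slow device: list of blocks
        self.dest = [None] * len(source)        # fast local device
        self.hydrated = [False] * len(source)   # per-block "already copied" map
        self.source_reads = 0                   # counts trips to the slow device

    def _read_source(self, i):
        self.source_reads += 1
        return self.source[i]

    def read(self, i):
        # Route to the destination only if that block has been copied.
        if self.hydrated[i]:
            return self.dest[i]
        return self._read_source(i)

    def hydrate_next(self):
        """Copy one missing block; return False when nothing is left to copy."""
        for i, done in enumerate(self.hydrated):
            if not done:
                self.dest[i] = self._read_source(i)
                self.hydrated[i] = True
                return True
        return False

    def fully_hydrated(self):
        return all(self.hydrated)
```

Calling `hydrate_next()` in a loop plays the role of the background hydrator; once `fully_hydrated()` is true, `read()` no longer touches the source.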

Litestream VFS's hydration is the page-granular + LTX-lineage specialisation: pages come from LTX files in object storage; the destination is a local SQLite database file; hydration uses compaction so the written file contains "only the latest versions of each page" rather than the full LTX history.
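The "latest version of each page" idea can be shown with a toy model. Here each LTX file is reduced to a `(txid, {page_no: data})` pair, which is an assumption made for illustration; the real LTX format carries headers, checksums, and transaction ranges that this sketch ignores.

```python
def compact(ltx_files):
    """Collapse a history of page writes into the latest version of each page.

    `ltx_files` is an iterable of (txid, {page_no: data}) pairs, a deliberately
    simplified stand-in for real LTX files.
    """
    latest = {}
    # Apply files in transaction order so newer pages overwrite older ones.
    for _txid, pages in sorted(ltx_files, key=lambda item: item[0]):
        latest.update(pages)
    return latest


history = [
    (2, {1: b"page1-v2", 3: b"page3-v1"}),
    (1, {1: b"page1-v1", 2: b"page2-v1"}),
]
compacted = compact(history)
print(sorted(compacted))  # [1, 2, 3]
```

The history holds four page writes, but the compacted result holds three pages, which is why a compacted hydration writes less than a replay of the full history would.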

Hydration file is disposable

Because hydration writes a copy of the remote state, the safe posture on restart is to discard the file and re-hydrate: the local copy might be stale, since a remote writer may have advanced the database between this process's previous exit and the current start. From the post:

"We can't trust that the database is using the latest state every time we start up, not without doing a full restore, so we just chuck the hydration file when we exit the VFS. That behavior is baked into the VFS right now."

Design consequence: hydration is per-process-lifetime, not persistent across restarts. This suits ephemeral servers (Sprites, FaaS, short-lived sandbox VMs), where process lifetimes are bounded and cold-opens are frequent. Long-running processes that rarely restart gain less: every restart re-downloads the full database, and the cold-open benefit is exercised only rarely.
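One way to express the "chuck it on exit" lifetime is a context manager that owns the hydration file. This is a sketch, not the VFS's code: the helper name is hypothetical, and removing a leftover file at startup (for example after a crash) is an added assumption that the post does not describe.

```python
import contextlib
import os


@contextlib.contextmanager
def disposable_hydration_file(path):
    """Yield `path` for use as a hydration target, deleting it on the way out."""
    # Assumption: a leftover from an earlier process may be stale, so drop it.
    with contextlib.suppress(FileNotFoundError):
        os.remove(path)
    try:
        yield path
    finally:
        # The local copy is never treated as durable state.
        with contextlib.suppress(FileNotFoundError):
            os.remove(path)
```

Anything that must survive the process belongs in object storage; the file yielded here exists only to make reads faster while the process is alive.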

Why not just run litestream restore?

litestream restore is the "download the whole database before serving a single request" posture — blocking, serial, request-budget-unfriendly on cold paths. Background hydration + VFS-backed Range GETs lets the application start serving queries immediately (the VFS is live before the download starts) while still getting to "full local database" steady-state as soon as the background loop finishes. Two phases, not one.

Concrete cold-boot budget: on a Fly.io Sprite cold-start, the storage stack must serve writes (not just reads) within milliseconds of an incoming web request triggering the Sprite bounce. litestream restore of a "low tens of megabytes" block map blows that budget; background hydration + writable VFS together meet it.

Pairs with: writable-VFS mode

For steady-state read performance in an ephemeral-server workload, combine with patterns/writable-vfs-with-buffered-sync:

  • Write path: local write buffer + async object-store sync.
  • Read path: cold → Range GET; hot → local hydrated file; freshly written → local write buffer.

Both patterns are opt-in features on the same VFS instance (Sprites enable both for their block-map case).
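The combined read path can be sketched as a lookup through three tiers. The lookup order (write buffer first, then hydrated file, then Range GET) and all names here are assumptions made for illustration; the source only lists which tier serves which kind of read.

```python
def read_page(n, write_buffer, hydrated, fetch_remote):
    """Return (data, tier) for page `n`.

    write_buffer: dict of pages written locally but not yet synced.
    hydrated:     dict-like view of the complete local file, or None
                  while hydration is still running.
    fetch_remote: callable standing in for a Range GET (hypothetical).
    """
    if n in write_buffer:
        # Freshly written pages must win, or a reader would see old data.
        return write_buffer[n], "write-buffer"
    if hydrated is not None:
        return hydrated[n], "hydrated-file"
    return fetch_remote(n), "range-get"
```

Checking the write buffer first matters in this model because neither the hydrated file nor object storage is guaranteed to contain a page that has only just been written locally.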

Trade-offs

  • Pages can be downloaded twice. Range GETs pull individual pages while the hydrator concurrently pulls the full database, so in the worst case every page in the working set is fetched once by each path.
  • Latency cliff at cutover. Read latency drops from object-store round-trip time to local disk-read time the moment hydration completes. Most applications welcome this, but pathological cases (e.g. routing driven by tight latency SLAs) may observe the abrupt transition.
  • Local file size = full database. Not a cache; not sparse. Hydration target must have capacity for the whole database.
  • Not durable. The hydration file is discarded on exit; anything durable lives in object storage.
  • No partial hydration API. Applications can't tell the VFS "hydrate these tables only" — it's all-or-nothing.

When it's the wrong shape

  • Cold-open doesn't matter. Long-running processes holding a connection for hours / days pay the hydration cost once; a straightforward litestream restore on startup is simpler.
  • Object storage is in-region and already fast enough. If Range-GET latency is tolerable for steady-state reads, skip hydration; save the local disk space.
  • Database size ≫ local disk budget. Hydration requires fitting the full database locally. Large databases on space-constrained hosts need a different approach (e.g. per-table caching).
  • Write-heavy workloads where the writer is already local. The writer already has the authoritative database on local disk; background hydration is redundant.

Seen in

  • sources/2026-01-29-flyio-litestream-writable-vfs: canonical wiki instance. Setting LITESTREAM_HYDRATION_PATH=... starts the background-hydration loop. Hydration uses LTX compaction to write "only the latest versions of each page"; reads don't block; cutover happens once the file is ready; the hydration file is temporary and is discarded on exit. Motivating consumer: Fly.io Sprites' block-map metadata store on Sprite cold-boot.