PATTERN Cited by 1 source
Background hydration to local file¶
Pattern¶
Serve queries against a remote data source immediately via a slow-but-cold-open-capable read path (e.g. HTTP Range GETs), while a background loop downloads the full dataset into a local file. When the local file is complete, cut reads over to it — faster steady-state latency, no user-visible stall.
Three phases over time:
t=0 open VFS / data source
│
│ Reads → Range GET against object storage (slow-ish)
│ Hydration thread: pulling LTX files, merging into
│ local file using compaction
│
t=T hydration complete
│
│ Reads → local file (fast)
│ Background loop exits
│
t=E process exits
│
│ Local hydration file discarded
Key property: reads never block on hydration. The cold-open path serves queries the moment the VFS is loaded; hydration is a latency-improvement optimisation, not a correctness requirement.
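The phases above can be sketched in a few lines of Python. This is a minimal illustration, not Litestream's implementation: `HydratingReader`, `remote_read`, and the in-memory dict standing in for the local file are all hypothetical names chosen for the example.

```python
import threading
from typing import Callable, Dict, Optional


class HydratingReader:
    """Serve page reads remotely until a background copy finishes, then locally."""

    def __init__(self, remote_read: Callable[[int], bytes], page_count: int):
        self._remote_read = remote_read      # slow path, e.g. one Range GET per page
        self._page_count = page_count
        self._local: Optional[Dict[int, bytes]] = None  # stand-in for the local file
        self._ready = threading.Event()      # set once hydration is complete
        self._thread = threading.Thread(target=self._hydrate, daemon=True)

    def start(self) -> None:
        """Kick off the background hydration loop."""
        self._thread.start()

    def _hydrate(self) -> None:
        # Build the full copy off to the side; readers never wait on this loop.
        pages = {n: self._remote_read(n) for n in range(self._page_count)}
        self._local = pages
        self._ready.set()                    # cutover point

    def read_page(self, n: int) -> bytes:
        if self._ready.is_set():
            return self._local[n]            # after cutover: local, fast
        return self._remote_read(n)          # before cutover: remote, always available

    def wait_hydrated(self, timeout: Optional[float] = None) -> bool:
        """Block until hydration completes (only useful in tests and tooling)."""
        return self._ready.wait(timeout)
```

The reader checks a single flag on each read, so a slow or stalled hydration loop only delays the speed-up; it never delays a query.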
Canonical instance: Litestream VFS¶
From the 2026-01-29 shipping post:
"To solve this problem, we shoplifted a trick from systems like dm-clone: background hydration. In hydration designs, we serve queries remotely while running a loop to pull the whole database. When you start the VFS with the
LITESTREAM_HYDRATION_PATH environment variable set, we'll hydrate to that file. Hydration takes advantage of LTX compaction, writing only the latest versions of each page. Reads don't block on hydration; we serve them from object storage immediately, and switch over to the hydration file when it's ready."
Activation: LITESTREAM_HYDRATION_PATH=/path/to/hydrated.db.
Prior art: dm-clone¶
Linux's dm-clone device-mapper target is the architectural template:
- A source device (remote, slow) + a destination device (local, fast).
- Reads are routed to the destination if the destination has that block; otherwise to the source.
- A hydrator runs in the background, copying blocks from source to destination.
- Once fully hydrated, all reads are local.
Litestream VFS's hydration is the page-granular, LTX-lineage specialisation: pages come from LTX files in object storage; the destination is a local SQLite database file; hydration uses compaction so the written file contains "only the latest versions of each page" rather than the full LTX history.
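To make "only the latest versions of each page" concrete, here is a rough sketch of the keep-latest idea. It does not parse the real LTX format: each change set is modelled as a hypothetical `(transaction id, {page number: page bytes})` tuple, and `compact_latest` is an illustrative name.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical stand-in for one LTX file: (transaction id, {page number: page bytes}).
ChangeSet = Tuple[int, Dict[int, bytes]]


def compact_latest(change_sets: Iterable[ChangeSet]) -> Dict[int, bytes]:
    """Return one image per page, taken from the newest change set that wrote it."""
    latest: Dict[int, bytes] = {}
    # Apply in ascending transaction order so later writes overwrite earlier ones.
    for _txid, pages in sorted(change_sets, key=lambda cs: cs[0]):
        latest.update(pages)
    return latest
```

The result holds one entry per page regardless of how many times that page was rewritten, which is why the hydrated file is bounded by the database size and not by the length of the history.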
Hydration file is disposable¶
Because hydration writes a copy of the remote state, the safe posture on restart is to discard and re-hydrate — the local file might be stale (remote writer advanced the database between this process's previous exit and the current start). From the post:
"We can't trust that the database is using the latest state every time we start up, not without doing a full restore, so we just chuck the hydration file when we exit the VFS. That behavior is baked into the VFS right now."
Design consequence: hydration is per-process-lifetime, not persistent across restarts. Works fine for ephemeral servers (Sprites, FaaS, short-lived sandbox VMs) where process lifetimes are bounded and cold-opens are frequent; wastes bandwidth on long-running processes that rarely restart.
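One way to express the discard-on-exit posture in application code is a context manager that owns the file for the lifetime of the process. This is a sketch under assumptions: `disposable_hydration_file` is a hypothetical helper, and the real VFS handles cleanup internally.

```python
import os
import tempfile
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def disposable_hydration_file(directory: str) -> Iterator[str]:
    """Yield a path for this process's hydration copy; delete it on the way out."""
    fd, path = tempfile.mkstemp(suffix=".db", dir=directory)
    os.close(fd)                 # the hydrator reopens the path itself
    try:
        yield path
    finally:
        # Never reuse across runs: the remote may have advanced in between.
        if os.path.exists(path):
            os.remove(path)
```

Because the path is freshly generated each time, a stale file left behind by a crashed process is never picked up by the next one.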
Why not just run litestream restore?¶
litestream restore is the "download the whole database
before serving a single request" posture — blocking,
serial, request-budget-unfriendly on cold paths. Background
hydration + VFS-backed Range GETs lets the application
start serving queries immediately (the VFS is live before
the download starts) while still getting to "full local
database" steady-state as soon as the background loop
finishes. Two phases, not one.
Concrete cold-boot budget: on a Fly.io Sprite cold-start,
the storage stack must serve writes (not just reads)
within milliseconds of an incoming web request triggering
the Sprite bounce. litestream restore of a "low tens of
megabytes" block map blows that budget; background
hydration + writable VFS together meet it.
Pairs with: writable-VFS mode¶
For read-steady-state performance of an ephemeral-server workload, combine with patterns/writable-vfs-with-buffered-sync:
- Write path: local write buffer + async object-store sync.
- Read path: cold → Range GET; hot → local hydrated file; freshly written → local write buffer.
Both patterns are opt-in features on the same VFS instance (Sprites enable both for their block-map case).
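The combined read path can be pictured as a three-way lookup. The precedence shown here (write buffer first, then hydrated file, then remote) is an assumption made for illustration, as are the function and parameter names; the source only lists the three cases.

```python
from typing import Callable, Dict, Optional


def read_page(
    page_no: int,
    write_buffer: Dict[int, bytes],
    hydrated: Optional[Dict[int, bytes]],
    range_get: Callable[[int], bytes],
) -> bytes:
    """Resolve one page read: newest local write, then hydrated copy, then remote."""
    if page_no in write_buffer:                        # freshly written, not yet synced
        return write_buffer[page_no]
    if hydrated is not None and page_no in hydrated:   # hydration has completed
        return hydrated[page_no]
    return range_get(page_no)                          # cold path, always available
```

Checking the write buffer first keeps a process's own unsynced writes visible to its reads, whichever phase hydration is in.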
Trade-offs¶
- Costs bandwidth twice during hydration: Range GETs pull individual pages while the hydrator pulls the full database concurrently. In the worst case, every working-set page is downloaded twice.
- Latency step at cutover. Read latency drops from object-store round-trip time to local-disk read time at the moment hydration completes; most applications welcome this, but pathological cases (tight latency-SLA-based routing) may notice the transition.
- Local file size = full database. Not a cache; not sparse. Hydration target must have capacity for the whole database.
- Not durable. The hydration file is discarded on exit; anything durable lives in object storage.
- No partial hydration API. Applications can't tell the VFS "hydrate these tables only" — it's all-or-nothing.
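The double-download cost in the first trade-off is easy to bound. The helper below is a back-of-the-envelope sketch: the function name is hypothetical, the 4096-byte default is SQLite's default page size, and the compacted hydration download is approximated by the database size.

```python
def worst_case_transfer_bytes(
    db_bytes: int, range_get_pages: int, page_size: int = 4096
) -> int:
    """Upper bound on bytes pulled before cutover: one full copy plus every Range GET."""
    return db_bytes + range_get_pages * page_size
```

For example, a 20 MiB database that serves 1,000 page reads remotely before cutover transfers at most 20 MiB plus about 4 MB of Range GET payload.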
When it's the wrong shape¶
- Cold-open doesn't matter. Long-running processes holding a connection for hours or days pay the hydration cost once; a straightforward litestream restore on startup is simpler.
- Object storage is in-region and already fast enough. If Range-GET latency is tolerable for steady-state reads, skip hydration and save the local disk space.
- Database size ≫ local disk budget. Hydration requires fitting the full database locally. Large databases on space-constrained hosts need a different approach (e.g. per-table caching).
- Write-heavy workloads where the writer is already local. The writer already has the authoritative database on local disk; background hydration is redundant.
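The disk-budget criterion can be checked before enabling hydration. The following is a small sketch using Python's standard library; `fits_locally` and the 10% headroom are assumptions for illustration, and the database size would have to come from wherever the application tracks it.

```python
import shutil


def fits_locally(db_size_bytes: int, target_dir: str, headroom: float = 0.10) -> bool:
    """True if target_dir has room for the whole database plus a safety margin."""
    free = shutil.disk_usage(target_dir).free
    return db_size_bytes * (1.0 + headroom) <= free
```

An application could skip setting the hydration path when this returns False and keep serving from the Range GET path alone.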
Seen in¶
- sources/2026-01-29-flyio-litestream-writable-vfs —
canonical wiki instance.
LITESTREAM_HYDRATION_PATH=... starts the background-hydration loop; hydration uses LTX compaction to write "only the latest versions of each page"; reads don't block; cutover happens once the file is ready; the hydration file is a temporary file, discarded on exit. Motivating consumer: Fly.io Sprites' block-map metadata store on Sprite cold-boot.
Related¶
- concepts/sqlite-vfs — the integration surface.
- concepts/ltx-file-format — the on-wire format hydration reads.
- concepts/background-hydration — the underlying concept the pattern realises.
- systems/litestream — the parent system.
- systems/litestream-vfs — the extension that realises this pattern.
- systems/fly-sprites — the motivating consumer.
- patterns/vfs-range-get-from-object-store — the always-available cold-path read primitive hydration layers on.
- patterns/writable-vfs-with-buffered-sync — the complementary write-path primitive.
- patterns/ltx-compaction — the compaction operation hydration reuses to write "latest versions only".