Background hydration¶
Definition¶
A cache-warming / replica-fill technique in which a system serves reads against a remote authoritative source while a background process downloads the full data set to local storage. Once local materialisation completes, reads cut over to the local copy — latency drops, round-trips to the remote source stop.
The defining property: reads never block on hydration. Cold-open serves queries immediately; hydration is a latency-optimisation, not a correctness prerequisite.
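A minimal sketch of that shape (names and structure are illustrative, not Litestream's implementation): a background thread copies the full data set to a local file while reads are served from the remote source, and once the copy completes, reads cut over.

```python
import tempfile
import threading

class HydratingReader:
    """Serve reads remotely while a background thread pulls the full
    data set to local disk; cut reads over once the copy completes."""

    def __init__(self, remote_read, size, block=4096):
        self.remote_read = remote_read          # fn(offset, length) -> bytes
        self.size = size
        self.block = block
        self.local_path = tempfile.NamedTemporaryFile(delete=False).name
        self.hydrated = threading.Event()       # set when the local copy is whole
        threading.Thread(target=self._hydrate, daemon=True).start()

    def _hydrate(self):
        # The hydration loop: pull every block, then flip the cutover switch.
        with open(self.local_path, "wb") as f:
            for off in range(0, self.size, self.block):
                f.write(self.remote_read(off, min(self.block, self.size - off)))
        self.hydrated.set()

    def read(self, offset, length):
        # Reads never block on hydration: remote until cutover, local after.
        if self.hydrated.is_set():
            with open(self.local_path, "rb") as f:
                f.seek(offset)
                return f.read(length)
        return self.remote_read(offset, length)
```

A cold open can answer its first query before the first block has landed; once the event is set, every read is local file I/O.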
Etymology — dm-clone and block-level ancestry¶
The term and the architectural shape come from Linux's dm-clone device-mapper target:
"The dm-clone target allows cloning of arbitrary block devices … while the source device stays read-only … The hydration process runs in the background, cloning the source device's data onto the destination device."
dm-clone serves reads from source-until-cloned, destination-after-cloned, with a per-block hydration bitmap tracking progress. Variations of the idea appear across storage systems under names like lazy replication, demand-paged fetch, and copy-on-read warming.
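The per-block routing can be sketched like this (a toy model of the dm-clone idea, not its kernel implementation):

```python
class BlockClone:
    """source-until-cloned, destination-after-cloned: a bitmap records
    which blocks have been hydrated and routes reads accordingly."""

    def __init__(self, source, block=4096):
        self.source = bytes(source)              # read-only source device
        self.dest = bytearray(len(self.source))  # destination device
        self.block = block
        self.bitmap = [False] * (-(-len(self.source) // block))

    def _span(self, i):
        return i * self.block, min((i + 1) * self.block, len(self.source))

    def hydrate_block(self, i):
        # Background hydrator: copy one block, then mark it in the bitmap.
        lo, hi = self._span(i)
        self.dest[lo:hi] = self.source[lo:hi]
        self.bitmap[i] = True

    def read_block(self, i):
        lo, hi = self._span(i)
        if self.bitmap[i]:
            return bytes(self.dest[lo:hi])       # destination-after-cloned
        return self.source[lo:hi]                # source-until-cloned
```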
Canonical wiki instance — Litestream VFS¶
Ben Johnson's 2026-01-29 shipping post:
"To solve this problem, we shoplifted a trick from systems like dm-clone: background hydration. In hydration designs, we serve queries remotely while running a loop to pull the whole database. … Reads don't block on hydration; we serve them from object storage immediately, and switch over to the hydration file when it's ready."
Litestream VFS's specialisation:
- Source: LTX files in object storage (S3-compatible).
- Destination: a local SQLite database file at the operator-specified LITESTREAM_HYDRATION_PATH.
- Hydrator: a background thread reading LTX files and writing the destination using LTX compaction, so the destination file contains "only the latest versions of each page", not the full LTX history.
- Cutover: once hydration is complete, VFS reads that were previously Range-GETs against object storage transition to local file I/O.
- Lifetime: the hydration file is a temp file discarded on process exit (see below).
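The cutover read path reduces to a small amount of arithmetic (function names here are illustrative, not Litestream's): a page number maps to a byte range, and the same range is satisfied by a Range-GET against object storage before cutover and by local file I/O after it.

```python
def read_page(pgno, page_size, hydration_done, local_path, range_get):
    """Sketch of the cutover read path (names are illustrative)."""
    off = (pgno - 1) * page_size                 # SQLite pages are 1-indexed
    if hydration_done():
        with open(local_path, "rb") as f:        # post-cutover: local file I/O
            f.seek(off)
            return f.read(page_size)
    # Pre-cutover: e.g. an object-store GET with header
    #   Range: bytes=<off>-<off + page_size - 1>
    return range_get(off, page_size)
```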
Hydration ≠ persistent cache¶
A crucial design distinction: the hydration file is not a persistent cache across process restarts:
"Because this is designed for environments like Sprites, which bounce a lot, we write the database to a temporary file. We can't trust that the database is using the latest state every time we start up, not without doing a full restore, so we just chuck the hydration file when we exit the VFS."
Rationale: a remote writer may have advanced the database between this process's previous shutdown and its next startup; the old hydration file is unsafe without a verification pass equivalent in cost to a fresh hydration. Discard-on-exit is the safe default.
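The discard-on-exit lifetime can be sketched as (paths and helper names are illustrative):

```python
import atexit
import os
import tempfile

def open_hydration_file(prefix="hydration-"):
    """Create the hydration target as a temp file and register its
    removal at process exit: the local copy is never trusted across a
    restart, because a remote writer may have advanced the database."""
    path = tempfile.NamedTemporaryFile(prefix=prefix, delete=False).name
    atexit.register(lambda: os.path.exists(path) and os.remove(path))
    return path
```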
This is the opposite choice from a persistent NVMe cache (concepts/read-through-nvme-cache), which can be trusted across restarts because its cache keys are content-addressed against immutable chunks in object storage. Hydration copies specific current page versions into a file that is not content-addressed against those versions, so it ages into staleness as soon as new writes land upstream.
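The distinction is easy to state in code: a content-addressed key is derived from the bytes themselves, so a surviving cache entry can never silently name different data, while a hydration file is keyed by position (page number) and goes stale wholesale once upstream writes land. A sketch:

```python
import hashlib

def content_key(chunk: bytes) -> str:
    # Key derived from content: new bytes => new key, so a persisted
    # cache entry can be trusted across restarts.
    return hashlib.sha256(chunk).hexdigest()

def position_key(pgno: int) -> str:
    # Key derived from position: the same key names whatever bytes the
    # page holds *now*, so a persisted copy may be stale after a restart.
    return f"page:{pgno}"
```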
Contrast with related techniques¶
| Technique | Cold read path | Local storage | Persists across restarts |
|---|---|---|---|
| Background hydration | remote GET | full DB copy | ❌ discarded |
| Read-through cache (concepts/read-through-nvme-cache) | remote GET | per-chunk lazy | ✅ (content-addressed) |
| Restore-before-serve | blocks until DB fully local | full DB copy | ✅ |
| Pure remote-VFS (no local file) | remote GET | none | n/a |
Background hydration picks the cold-open speed of remote-VFS and the steady-state speed of local-DB without the persist-across-restarts of either. Ideal for ephemeral server substrates (Sprites, FaaS, short-lived sandbox VMs) where cold opens are frequent and process lifetimes are bounded; suboptimal for long-lived processes, where amortising one litestream restore across hours of uptime costs less than re-hydrating on every start.
Why it exists on Litestream VFS¶
The 2025-12-11 read-only VFS shipped a remote-read-only surface: "a godsend in a cold start where we have no other alternative besides downloading the whole database, but it's not fast enough for steady state" (quoting the 2026-01-29 post). Background hydration is the answer to the steady-state problem — cold reads still go remote; hot reads eventually land on local disk as the hydrator catches up.
Motivating consumer: the Fly.io Sprite block-map, "low tens of megabytes" of metadata that must be queryable milliseconds after Sprite boot but should not pay S3 round-trips for the rest of the Sprite's lifetime.
Operational parameters not typically disclosed¶
- Hydration throughput (MB/s from remote source).
- Concurrency (how many Range GETs in flight during hydration).
- Back-pressure against query traffic (does hydration throttle if reads are saturating the network?).
- Progress reporting surface (can the application observe "N% hydrated"?).
- Cutover semantics (how mid-query cuts are handled — serializable cutover point, or per-connection flip?).
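None of these knobs are documented, but the first two on the observability side are easy to picture. A hypothetical hydrator loop with a progress counter and a crude pause/resume gate for back-pressure (all names are assumptions, not Litestream API):

```python
import threading

class Hydrator:
    """Hypothetical knobs: an observable 'N% hydrated' counter and a
    pause/resume gate the read path could use as back-pressure."""

    def __init__(self, nblocks):
        self.nblocks = nblocks
        self.copied = 0
        self._gate = threading.Event()
        self._gate.set()                 # start unthrottled

    def pause(self):                     # reads are saturating the link
        self._gate.clear()

    def resume(self):
        self._gate.set()

    def run(self, copy_block):
        for i in range(self.nblocks):
            self._gate.wait()            # back-pressure point
            copy_block(i)
            self.copied += 1

    @property
    def percent(self):
        return 100 * self.copied // self.nblocks
```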
Seen in¶
- sources/2026-01-29-flyio-litestream-writable-vfs — canonical wiki instance. "Shoplifted a trick from systems like dm-clone." LITESTREAM_HYDRATION_PATH=... activates the feature; compaction-driven write of latest page versions; reads served remotely until the file is ready; hydration file discarded on exit.
Related¶
- concepts/read-through-nvme-cache — the sibling persistent-cache technique for immutable-chunk substrates.
- concepts/object-storage-as-disk-root — the umbrella durability posture background hydration layers on.
- concepts/sqlite-vfs — the integration surface.
- systems/litestream-vfs — the canonical implementer.
- patterns/background-hydration-to-local-file — the pattern page.
- patterns/vfs-range-get-from-object-store — the always-available cold-path read primitive that hydration layers on top of.